DOCUMENT RESUME

ED 254 162                                        HE 018 127

TITLE          Statistical Sampling Handbook for Student Aid
               Programs: A Reference for Non-Statisticians. Winter
               1984.
INSTITUTION    Office of Student Financial Assistance (ED),
               Washington, DC.
PUB DATE       1984
NOTE           111p.; For related documents, see HE 018 112-135 and
               HE 018 137-140.
PUB TYPE       Guides - Non-Classroom Use (055) -- Tests/Evaluation
               Instruments (160)
EDRS PRICE     MF01/PC05 Plus Postage.
DESCRIPTORS    Administrator Guides; College Students; Computation;
               *Federal Aid; Government Employees; Higher Education;
               *Prediction; Program Administration; *Records
               (Forms); Sample Size; *Sampling; *Statistical
               Analysis; *Student Financial Aid; Student Financial
               Aid Officers
IDENTIFIERS    *Office of Student Financial Assistance

ABSTRACT

A manual on sampling is presented to assist audit and
program reviewers, project officers, managers, and program
specialists of the U.S. Office of Student Financial Assistance
(OSFA). For each of the following types of samples, definitions and
examples are provided, along with information on advantages and




included. Three examples of the potential uses of sampling statistics
and the forms by OSFA are provided, and potential problems that could
arise are addressed. The forms are used to: calculate the estimate of
the population variance from a sample; develop population estimates
from a simple random sample; determine minimum necessary sample




sampling formulas and symbols, a 15-item annotated bibliography, and an index
by primary reference or definition. (SW)



Reproductions supplied by EDRS are the best that can be made
from the original document.



Office of Student Financial Assistance / U.S. Department of Education

Winter 1984

Statistical Sampling Handbook
for Student Aid Programs

A Reference for Non-Statisticians



STATISTICAL SAMPLING HANDBOOK
FOR STUDENT AID PROGRAMS

A Reference for Non-Statisticians

Prepared by
Division of Quality Assurance
Office of Student Financial Assistance
Office of Postsecondary Education
Department of Education

WINTER 1984



TABLE OF CONTENTS

Chapter

1.  WHEN TO SAMPLE

      Introduction
      Using the Manual
      The Advantages and Disadvantages of Sampling
      Judgmental and Statistical Sampling

2.  THE LANGUAGE OF STATISTICAL SAMPLING

      Introduction
      Population and Sample
      Sampling Error
      Point and Interval Estimates
      Confidence Intervals, Confidence Levels, and Confidence Limits
      Summary

3.  TYPES OF SAMPLES

      Introduction
      Simple Random Sampling
          Using a random number table to draw a simple random sample
      Stratified Sampling
      Cluster Sampling
      Systematic (Interval) Sampling
          Drawing a systematic sample from filing cabinets
      Dollar-Unit Sampling
      Sequential (Stop or Go) Sampling
      Discovery (Exploratory) Sampling
      Multi-Stage Sampling
      Opportunity Sampling
      Quota Sampling
      Summary

4.  COMPUTING SAMPLE STATISTICS

      Introduction
      Form A: Estimating the Variance of a Population from a Sample
      Form B: Developing Population Estimates from a Simple Random Sample
      Form C: Determining Sample Sizes
      Form D: Using a Calculator to Compute the Estimate of the
              Population Variance from a Sample
      Form E: Using a Calculator to Compute Population Estimates
              from a Simple Random Sample

5.  APPLICATIONS OF SAMPLING TO STUDENT FINANCIAL AID

      Introduction
      Example 1: Review of SEOG Awards at University A
      Example 2: BEOG Applicant Quality Control Study
      Example 3: Review of CWS Audit Report from University B

APPENDIX A: INTRODUCTION TO SAMPLING STATISTICS

      Introduction
      Total, Population Size, Mean, Variance, Standard Deviation
          and Distribution
      Attributes
      Population and Sample Symbols
      Estimating the Population Mean
      Estimating the Population Total and Standard Deviation
      Standard Error of the Mean
      Confidence Interval for the Mean
      Small Samples
      Confidence Interval for the Total
      The Relation Between the Confidence Interval and the
          Confidence Level
      The Relation Between Sample Size and Population Size
      Self-Testing Review
      Answers to Self-Testing Review

APPENDIX B: SAMPLING SYMBOLS AND FORMULAS

APPENDIX C: BIBLIOGRAPHY

INDEX






For many research and audit purposes, it is not practical or possible
to collect data on all the cases in the population under study, but only
on some fractional part called a sample. A sample is chosen to represent
the total population from which it is drawn. If properly constructed,
sample characteristics can provide the basis for making valid statistical
inferences about the total population.

Sampling is not the sole province of statisticians, but occurs in
hundreds of ways in everyday life. The cook samples soup to correct the
seasoning. The elementary school teacher quizzes students on a sample of
arithmetic problems to determine the students' overall mathematical
abilities. The political pollster samples the electorate to predict the
outcomes of elections.






Virtually every division of the Office of Student Financial
Assistance employs sampling. However, three groups within OSFA are
particularly frequent users of sample data. These three groups are audit
and program reviewers in the Division of Certification and Program Review
(DCPR) and project officers, managers and program specialists in a
variety of divisions.

The DCPR audit reviewers are recipients of sampling-based audit data,
which they review, analyze and summarize for reports. In addition, they
are often called upon to defend their understanding of the data when
recipient institutions challenge audit findings. These tasks require the
ability to develop estimates from sample data and a basic understanding
of statistical sampling and its application to program auditing.

Each year DCPR conducts hundreds of program reviews at recipient
institutions. During the course of the reviews, a selection of cases at
each institution is examined for compliance with program procedures.
Although, in general, formal statistical procedures are not employed in
selecting cases for examination, program reviewers could benefit from a
basic knowledge of statistical sampling techniques. Such knowledge would
permit reviewers to employ statistical sampling procedures where
appropriate as well as to advise institutions on methods of conducting
internal reviews.

Project officers, managers, and program specialists comprise the
third group of frequent sample data users. The sampling needs of members
of this group are less specialized and more general than those of the
other two groups. Common tasks requiring knowledge of statistical
sampling include: designing samples for quality checks; selecting samples
of students and/or institutions for research studies; and reviewing
sampling plans and sample data submitted by contractors.

USING THE MANUAL

This manual is designed to address the statistical sampling needs of
the groups listed above and OSFA as a whole. Because almost all
potential users of this manual are not statisticians, mathematical
exposition and technical language have been kept to a minimum. The
manual concentrates on explaining, in clear and nonmathematical language,
the issues raised in sampling, the utility of sample data, and methods of
calculating required sample sizes and making population estimates from
sample data. Manual chapters have been written so that, to the maximum
extent possible, each chapter stands by itself. This format allows use
of the manual both as a reference source and as a self-teaching
introduction to sampling.

The manual is divided into two primary sections. The first section,
chapters one, two, and three, introduces the advantages and disadvantages
of sampling, the terminology of sampling, and the major types of
samples. The focus of these chapters is conceptual, not mathematical.
The second section, chapters four and five and appendices A, B, and C,
introduces the statistics of sampling in several different, and largely
independent, ways. Chapter 4, Computing Sample Statistics, contains a
series of forms to aid in calculating a variety of common sample
statistics. Chapter 5, Applications of Sampling to Student Financial
Aid, gives examples of statistical sampling uses in OSFA and illustrates
the use of the forms contained in chapter 4. Appendix A, Introduction to
Sampling Statistics, is an optional resource for those who want a basic
introduction to the mathematics of sampling. Appendix B summarizes basic
sampling formulas and symbols. Finally, for those who would like a
fuller explanation of sampling statistics or more advanced or specialized
statistical information, Appendix C presents a short annotated
bibliography.

THE ADVANTAGES AND DISADVANTAGES OF SAMPLING

In many cases, drawing a statistical sample has a wide range of
potential advantages over examination of the entire population under
study.

• Reduced costs. For many of the needs of OSFA auditors and
  program reviewers, it is not practical or possible to examine
  all the student files in an institution under review. In such
  cases a sample of student aid files can often produce the
  necessary information at a fraction of the cost of a 100%
  review. Similarly, OSFA researchers can often statistically
  describe the total population of student financial assistance
  recipients on the basis of sample data when a complete census of
  all recipients would be prohibitively expensive.

• Reduced respondent burden. Sampling allows for shorter reviews
  and less disturbance of reviewed agencies than would be possible
  with 100% reviews.

• Greater accuracy through better quality control. When a small
  number of records is being reviewed, data collection, analysis
  and summarization may be more carefully supervised and
  controlled than might be possible in a review of all the
  documents required in a 100% review.

• Greater range of information obtainable. Sampling permits the
  researcher, auditor, or program reviewer to examine a wider
  range of topics than would generally be practical if a 100%
  review were required for every topic addressed.

• Faster reporting of results. The time required for data
  collection and summarization can be greatly reduced through the
  use of sampling.

The use of sampling, however, is not always appropriate, or without
difficulties. The primary reasons for not using sampling are:

• Sampling results in incomplete knowledge. By its very nature,
  sample data can only produce estimates of population
  characteristics. For example, from a sample of student
  financial aid records, it is possible to estimate the total
  dollar amount of student aid overpayments. It is not possible,
  however, to determine the exact dollar amount of overpayment or
  which students received overpayments and which did not.

• Sampling introduces potential bias. When a sample is not
  correctly constructed, the conclusions based on the sample can
  be biased. For example, the famous Literary Digest poll that
  predicted Alf Landon would defeat Franklin Roosevelt for
  president in 1936 reached the wrong conclusion because of a
  faulty sampling technique. Individuals in the sample polled were
  selected out of telephone directories. In 1936, telephone
  subscribers were among the more prosperous of the voting
  population and, as a group, predominantly supported the
  Republican Landon.

• Sampling produces only aggregate statistics. If information is
  needed for every individual in the population, sampling is
  inappropriate.

• Sampling can be inappropriate when studying small populations
  (under 100 cases). When the number of files to be reviewed,
  records to be audited or population to be studied is small
  enough, it may become logistically and administratively simpler
  not to sample and to do a complete review.



In summary, sampling can be a useful tool for reducing review, audit
and research costs, and respondent burden; for maintaining high quality
control; for expanding the scope of research; and for speeding the
reporting of results. Sampling, however, is not appropriate when exact
knowledge of a population characteristic is required or when information
is needed for individual cases in the population. Faulty sampling can
produce biased results. Because sampling introduces certain
complications in audits, program review and research efforts, its
advantages do not always outweigh its disadvantages. This is
particularly true when studying small populations.

JUDGMENTAL AND STATISTICAL SAMPLING

Once the decision has been made to sample, it must be decided whether
formal statistical sampling procedures should be followed in constructing
the sample. There are two primary types of samples: judgmental (or
purposive) and statistical (or probability). For a judgmental sample,
cases are chosen for study on the basis of the selector's knowledge and
experience. For a statistical sample, one or another variation of random
selection is employed in choosing cases for study. In Chapter 3 the
major types of judgmental and statistical samples are reviewed. Here the
more general question of the relative advantages and disadvantages of
statistical sampling versus judgmental sampling is discussed.

JUDGMENTAL SAMPLING

Judgmental sampling can be an efficient method of locating cases of
interest. For example, a program reviewer may have learned from
experience that procedural errors are more likely to be found in thick,
dog-eared files than in thin, clean files. Therefore, in a review to
discover whether procedural errors exist, selecting only thick, dog-eared
files may be more efficient than statistical sampling.

Judgmental sampling can be more efficient than statistical sampling
in describing a population on the basis of very small samples. Research
has shown judgmental sample selection is most effective when the sample
is small (eight or less); when the population sampled is small and
visible, or known to the selector; and when the selector has great and
proven skill in this art.

Judgmental samples can involve fewer complications than statistical
sampling. Judgmental sampling can eliminate elaborate statistical
sampling procedures and analysis.

STATISTICAL SAMPLING

Statistical sampling is superior to judgmental sampling in a number
of very important ways.

Results of statistical sampling are objective and defensible.
Because statistical sampling rests on demonstrable mathematical
principles, the results are objective and defensible before reviewers,
recipients, and even courts. Questions of bias or bad judgment which can
be raised against judgmental samples can be eliminated through
statistical sampling.

Results of statistical sampling provide a sound basis for drawing
inferences about the total population from which the sample was drawn.
For example, examination of a statistical sample of student aid files in
a university could provide the basis for estimating the total number of
aid overpayments in that university. In contrast, a judgmental sample of
thick or dog-eared files, while it might be efficient in locating
particular errors, could not serve as the basis for estimating the total
number or size of errors. This is because judgmental samples violate the
statistical principles which make possible projections from a sample to a
total population.

Statistical sampling provides an estimate of sampling error. For
judgmental samples there is no way of knowing whether two different
samples are likely to produce the same or widely divergent results.
There is also no method of determining how sample results are likely to
compare with the results that would be obtained from a 100 percent review
of all cases in the population. Statistical sampling, however, produces
estimates of sampling error. For example, if a simple random statistical
sample of 200 loan records out of 2,000 student loans made by a single
lender found 20 of the loans delinquent, it would be possible to estimate
that the chances are 95 in 100 that the total number of delinquent
loans for the lender would be somewhere between 121 and 279. (Chapter 4
presents an explanation of how such estimates are calculated.)
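
The 121-to-279 range quoted for the delinquent loan example can be reproduced with a short calculation. The sketch below is an illustration only. It assumes the usual normal-approximation formula for a proportion, a finite population correction, and a multiplier of 1.96 for 95 percent confidence; the forms in Chapter 4 may differ in detail. The function and parameter names are hypothetical.

```python
import math

def interval_for_total(population_size, sample_size, cases_found, z=1.96):
    """Estimate a population total from a simple random sample of yes/no cases.

    Assumes the normal approximation with a finite population correction;
    z=1.96 corresponds to a 95 percent confidence level.
    """
    p = cases_found / sample_size                    # sample proportion
    fpc = 1 - sample_size / population_size          # finite population correction
    std_error = math.sqrt(fpc * p * (1 - p) / (sample_size - 1))
    point = population_size * p                      # point estimate of the total
    margin = population_size * z * std_error         # half-width of the interval
    return point - margin, point, point + margin

low, point, high = interval_for_total(2000, 200, 20)
print(round(low), round(point), round(high))  # 121 200 279
```

Under these assumptions the point estimate is 200 delinquent loans, with confidence limits of about 121 and 279.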

Statistical sampling provides an objective means of determining
necessary sample size to meet program review, audit or research
purposes. Returning to the previous example, assume that the Federal
reviewer determined that it is necessary to estimate the number of
delinquent loans for the lender within a range of plus or minus 50
loans. To achieve this level of accuracy, it is possible to determine in
advance the necessary minimum sample size of 433 cases.
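
The figure of 433 cases is consistent with a standard sample size formula. The sketch below is a hedged illustration, not the handbook's own form: it assumes a simple random sample, a 95 percent confidence level (multiplier 1.96), an advance guess of a 10 percent delinquency rate taken from the earlier sample, and a finite population correction. All names are hypothetical.

```python
import math

def minimum_sample_size(population_size, expected_rate, margin_in_cases, z=1.96):
    """Sample size needed to estimate a total within +/- margin_in_cases.

    Assumes a simple random sample and the normal approximation;
    expected_rate is an advance guess at the true proportion.
    """
    d = margin_in_cases / population_size            # margin expressed as a proportion
    n0 = z ** 2 * expected_rate * (1 - expected_rate) / d ** 2  # size for a very large population
    return n0 / (1 + n0 / population_size)           # corrected for a finite population

n = minimum_sample_size(2000, 0.10, 50)
print(round(n))       # 433
print(math.ceil(n))   # 434, if one prefers always to round up
```

The uncorrected size is about 553 cases; the correction for a population of only 2,000 loans brings it down to about 433.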

Statistical sampling results may be combined and evaluated even when
conducted in different locations and by different individuals. As an
example, results of statistical sample-based audits, independently
conducted by different auditors on various campuses of a single
university system, can easily be combined and analyzed to produce
estimates for an entire university system. Because procedures for
selecting judgmental samples inevitably vary from auditor to auditor and
from circumstance to circumstance, the results of judgmental samples
seldom can be combined.

Statistical sampling is flexible enough to incorporate most of the
advantages of judgmental sampling. Statistical samples can be tailored
to particular needs and circumstances in a great variety of ways which
allow incorporation of the auditor's, program reviewer's or researcher's
knowledge and experience. If, for example, a program reviewer has
advance knowledge that thick files are more likely to contain errors, a
stratified statistical sampling method could be developed which gives
thick files a higher likelihood of being selected than thin files. Such
a sampling method would incorporate the advantages of judgmental sampling
while retaining all the advantages of statistical sampling.
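
The thick-file idea can be sketched in a few lines. The example below is an assumption-laden illustration rather than a procedure from this manual: the file names, the two strata, and the sampling fractions (50 percent of thick files, 10 percent of thin files) are all hypothetical. Each stratum is sampled at random, and each stratum's error rate is projected to that stratum's full size so that the heavier sampling of thick files does not distort the overall estimate.

```python
import random

def draw_stratified_sample(strata, fractions, seed=None):
    """Draw a simple random sample within each stratum at its own fraction."""
    rng = random.Random(seed)
    sample = {}
    for name, files in strata.items():
        count = max(1, round(len(files) * fractions[name]))
        sample[name] = rng.sample(files, count)
    return sample

def estimate_total_with_errors(strata, sample, has_error):
    """Project each stratum's sample error rate to the stratum's full size."""
    total = 0.0
    for name, files in strata.items():
        drawn = sample[name]
        error_rate = sum(1 for f in drawn if has_error(f)) / len(drawn)
        total += error_rate * len(files)
    return total

# Hypothetical population: 100 thick files and 900 thin files.
strata = {
    "thick": [f"thick-{i}" for i in range(100)],
    "thin": [f"thin-{i}" for i in range(900)],
}
sample = draw_stratified_sample(strata, {"thick": 0.5, "thin": 0.1}, seed=1)
print(len(sample["thick"]), len(sample["thin"]))  # 50 90
```

Here a thick file is five times as likely to be drawn as a thin one, yet every file still has a known chance of selection, which is what keeps the sample statistical.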






The choice between judgmental and statistical sampling must be made
on a case by case basis. Judgmental sampling is effective when the
sample and the population sampled are very small and visible or well
known to the selector and the selector has skill and experience in
drawing such samples. Statistical sampling is superior where objective,
defensible results are required or where projections to the total
population are to be made or where the sample size is twenty-five or
larger. In most cases, the advantages of statistical and judgmental
sampling can be combined by tailoring statistical samples to
circumstances and needs.



Statistical sampling has its own special language. The language is
composed of common English words which are given special meanings,
letters from the Roman and the Greek alphabets and mathematical
notation. Although, at first view, this language can be intimidating,
the basic underlying concepts are very straightforward. This section
introduces the basic language and concepts of sampling.



POPULATION AND SAMPLE

The dictionary's first definition of population is all the people in
a country or region. In statistics, the term population is used much
more broadly to mean the total set of items, persons, files, etc. from
which a sample is taken. Items which compose a population could be
individual students, receipts, apples, universities, files or any other
set of entities to be studied. A sample is any portion of the population
selected to be studied.

A sampling unit is a selected item or case from or about which
information is sought. It is often possible for the sampling unit to be
defined a number of different ways in the same area of study. For
example, in an audit of student financial aid, the sampling unit could be
defined as the individual recipient or as each financial aid award.

A measure which describes a population is called a parameter. For
example, if the population under study is a year's BEOG awards in a
particular university, the number of awards, the total dollar amount of
awards, and the average amount of the awards in the university could all
be parameters. A statistic is a characteristic of a sample. If we were
to draw a sample of BEOG awards in the university, the number of awards
in the sample, the total dollar value of the sample awards and the
average dollar value of sample awards are all statistics. When we make
generalizations about a population on the basis of sample data, we are
using statistics to estimate parameters. To help maintain the
distinction between parameters and statistics, Greek letters such as
μ, σ, and τ are generally used to denote parameters and lowercase Roman
letters to denote sample statistics. Table A 7 on page A 9 summarizes
the basic symbols and formulas used in statistical sampling.

SAMPLING ERROR

Estimates of population parameters can be calculated from sampling
statistics. By its very nature, sample data can produce only estimates
of population parameters. For example, from a sample of student aid
files in a university, it is possible to estimate the percent of
procedural discrepancies in the total university. It is not possible to
determine the exact number of files containing procedural discrepancies
in the university. This means that sample-based estimates of population
characteristics are always subject to error. Sampling error is judged in
two ways: bias and reliability or precision.

A biased sampling scheme is one which, on repeated trials, produces
average estimates of a population characteristic which differ from the
true value. An instance of a biased sample would be selection of cases
that an auditor has advance reason to believe have overawards.
Projecting the results of such a sample to the total population of grant
recipients would tend to systematically overestimate the average dollar
amount of overawards.

The second major type of sampling error is due to limitations in
sample reliability or precision. Very small samples are particularly
prone to low reliability. For example, a random sample of 5 student
files used to estimate the total amount of student loan overawards in a
large university would not be biased because it would not systematically
under- or overestimate the true amount of overawards. Such a sample
would, however, have a very low reliability in that additional samples of
five student files are likely to produce very different population
estimates.

The difference between bias and reliability can be illustrated by
considering the performance of four guns, A, B, C, and D. Each gun has
been fired at a target twenty times. The resulting patterns of hits and
misses are shown in Figure 2.1.



Gun A has very low reliability in that its shots are spread all over the
target. However, gun A is not biased in that its aim is not
systematically low or high or off to the right or left. Gun B has both
low reliability and high bias because there is a wide spread of its shots
and, on the average, its aim is low and to the right. Gun C has high
reliability; its shots all fall in a very limited area. However, its aim
is biased high and to the left. Only gun D has both high reliability and
no bias as shown by the fact that all the shots are very close to the
center of the target.
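
The four guns can be imitated with a small simulation, offered here only as a rough illustration. The aim offsets and spreads below are invented numbers chosen to match the verbal descriptions, and 2,000 shots per gun are simulated (rather than twenty) so that the summary figures are stable. Bias is measured as the average position of the shots relative to the center of the target; reliability is measured by how widely the shots scatter.

```python
import random
import statistics

# Hypothetical settings: (aim offset to the right, aim offset upward, spread of shots).
GUNS = {
    "A": (0.0, 0.0, 3.0),    # no bias, low reliability
    "B": (2.0, -2.0, 3.0),   # aims low and to the right, low reliability
    "C": (-2.0, 2.0, 0.3),   # aims high and to the left, high reliability
    "D": (0.0, 0.0, 0.3),    # no bias, high reliability
}

def fire(gun, shots, rng):
    """Return a list of (x, y) hits; the target center is (0, 0)."""
    right, up, spread = GUNS[gun]
    return [(rng.gauss(right, spread), rng.gauss(up, spread)) for _ in range(shots)]

def summarize(hits):
    """Return the average hit position (bias) and the average scatter (spread)."""
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    bias = (statistics.mean(xs), statistics.mean(ys))
    spread = (statistics.pstdev(xs) + statistics.pstdev(ys)) / 2
    return bias, spread

rng = random.Random(0)
for gun in GUNS:
    (bx, by), spread = summarize(fire(gun, 2000, rng))
    print(f"Gun {gun}: bias=({bx:+.2f}, {by:+.2f})  spread={spread:.2f}")
```

A sampling plan behaves the same way: a small random sample resembles gun A, a large sample of hand-picked problem cases resembles gun C, and a well-designed statistical sample of adequate size resembles gun D.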

POINT AND INTERVAL ESTIMATES

When sample data is used to produce a single estimate of a population
parameter, the estimate is known as a point estimate. Sample-based,
single-value estimates of the number of procedural discrepancies in
student aid files, the average level of overawards, and the number of
delinquent loans are all examples of point estimates. Because
sample-based estimates are inherently subject to a certain error, point
estimates seldom exactly match the parameter which they are used to
estimate. Therefore, it is a common practice to make interval estimates
as well as a point estimate of a parameter.

An interval estimate is one which specifies a range of values rather
than a single value. To say that the number of procedural errors falls
between 120 and 150, or that the rate of delinquent loans is 4 percent
plus or minus 2 percent, or that the average amount of BEOG is $500 plus
or minus $50, is to make an interval estimate.

CONFIDENCE INTERVALS, CONFIDENCE LIMITS, AND CONFIDENCE LEVELS

An interval estimate of a population parameter is called a confidence
interval and the end points of the interval are known as confidence
limits. To understand how statisticians use these terms, it is necessary
to define an additional term: confidence level. Confidence level is the
level of probability associated with an interval estimate; it is an
indicator of the degree of certainty that the particular method of
estimating the confidence interval will produce an estimate which
includes the true population value. The higher the confidence level
associated with an interval estimate, the more certainty there is that
the method of estimation will produce an estimate containing the true
value. As an example, if we were to say "On the basis of a random sample
of 400 cases, with a 95 percent confidence level, the percent of
delinquent loans for a lender falls between 45 and 55 percent," what we
would be arguing is that if repeated samples of 400 cases each were drawn
from the same population, 95 percent of the estimated confidence
intervals would contain the true percent of delinquent student loans.
The point to remember is that the confidence level refers to the
procedure used in drawing the sample and in estimating the confidence
interval rather than to any particular interval. Therefore, it is an
error to make such statements as "The probability is 95 percent that the
percent of delinquent loans is between 45 and 55 percent."
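
The repeated-sampling meaning of the confidence level can be checked by simulation. The sketch below is an illustration under stated assumptions: a hypothetical lender whose true delinquency rate is 50 percent, a loan population large enough that each draw can be treated as independent, and the simple normal-approximation interval with a multiplier of 1.96. It draws many samples of 400 cases and counts how often the computed interval contains the true rate.

```python
import math
import random

def interval_for_proportion(sample, z=1.96):
    """95 percent confidence limits for a proportion (normal approximation)."""
    n = len(sample)
    p = sum(sample) / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

def coverage(true_rate, sample_size, repetitions, seed=0):
    """Fraction of repeated samples whose interval contains the true rate."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(repetitions):
        # Each case is delinquent (True) with probability true_rate.
        sample = [rng.random() < true_rate for _ in range(sample_size)]
        low, high = interval_for_proportion(sample)
        if low <= true_rate <= high:
            covered += 1
    return covered / repetitions

print(coverage(true_rate=0.5, sample_size=400, repetitions=1000))
```

The printed fraction should fall close to 0.95. Any single interval either contains the true rate or does not; the 95 percent describes how the procedure performs over many samples.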



SUMMARY

This completes our introduction to the language of sampling.
Statistical sampling, of course, includes many more terms than have been
reviewed in this chapter. However, the basic terms and concepts reviewed
are sufficient for understanding the advantages and disadvantages of the
various sampling designs reviewed in Chapter 3 and for use of the basic
sampling formulas introduced in Chapter 4.



TYPES OF SAMPLES

There are many types of samples. No single type is superior in all
circumstances. This chapter discusses the major options the researcher,
auditor and program reviewer have in constructing a sample. The major
types of samples are defined, one or more examples of each are given and
advantages and disadvantages of each type are discussed. For two of the
most common types of samples, simple random and systematic, sections are
included on how to construct the sample. The types of samples presented
are not mutually exclusive; they can be combined in various ways. It is
possible, for example, to draw a multi-stage, stratified, cluster,
sequential, dollar-unit, discovery sample.



Simple Random Sampling

If a sample is drawn from a population in such a way
that every possible sample containing the same number
of cases has the same chance of being selected, the
sampling procedure is called simple random sampling.
The most common way of drawing a simple random sample
is to assign all cases in a population a number and
then select cases by the use of a random number table.

After a program review, a large university was
required to conduct a sample-based audit of NDSL, SEOG
and CWS awards. It was determined that to give a
confidence level of 95 percent and a reliability of
plus or minus 2 percent, assuming a rate of error in
the records of not over 2 percent, a minimum sample
size of 137 per program was necessary. To obtain at
least 137 students in NDSL, SEOG and CWS, a total
sample of 300 student aid recipients was drawn from
the university's financial aid computer file using a
random number table. In the resulting sample there
were 163 CWS recipients, 169 SEOG recipients and 160
NDSL recipients. In this case, a single sample was
able to serve the multiple purposes of reviewing
awards in three programs.

Simple random sampling produces unbiased estimates of
population parameters and the results are the easiest
to analyze of all statistical sampling methods.

Simple random sampling requires a complete listing of
all cases in the population sampled and, in general,
is less precise, given a fixed sample size, than
stratified sampling.



16 , 

22 



Use of a random table nainber tot draw a sample random sanlpTfe r T^je roost 

' cownon sjejyiod of drawing a random saRple is through 
• use of a random nuBibei^ table, "Jables of 'random 
numbers are created in such a W that the integers of 
' ■ , '0 through 9 all have an equal probability of bccurring 
• . in any position on the table, ptt digits appear on « 
V page In a random fashion* Tabll 3.1,^^ 

is an example of a randdn number table. - 

To illustrate use of this table, we will draw a simple
random sample of 50 students from a population of 735
student aid recipients at a particular university.
First, we take the list of student aid recipients and
number them from 001 to 735. Second, we select a
starting point on the table. To do so, I simply
closed my eyes and stabbed the table with my pencil.
The first try missed the table altogether. The second
try landed on the '1' underlined on the table. Since
we need three-digit numbers (001 to 735), we will
consider the '1' to be the first digit and the '2' and
'0' digits directly following it to be the second and
third digits. The first number is therefore '120'.

So student number '120' is selected for inclusion in
the sample. We then read down the column to find the
next sample member (715). Reading down the column we
select '519', '147', and '321'. However, then we come
to '827'. Because 827 is not a number on our student
list, we skip it and continue down the column until we
come to the next three-digit number between 001 and
735 inclusive.

We continue down the three-digit column selecting
eligible numbers, then shift to the next columns of
digits, reading as far as necessary to draw a sample of
50 students. Table 3.2 includes the actual list of 50
eligible random three-digit numbers selected.




TABLE 3.1: RANDOM NUMBERS

6? 81 17 50 68 00 35 10 30 90 59 71 09 95 01 14
78 95 64 65 24 82 14 05 27 63 33 96 10 41 8? 70
84 28 44 68 07 47 21 47 56 ?? 32 87 28 40 40 50
92 33 63 98 99 22 09 21 97 ?8 10 03 79 46 17 13
?? 79 75 50 29 36 12 37 63 39 02 47 57 02 97 17
80 16 09 75 22 28 35 25 53 57 72 64 09 98 63 50
68 20 33 03 43 73 80 96 21 ?3 97 61 98 37 35 77
55 26 ?5 04 30 60 68 10 73 53 89 35 53 45 83 23
60 00 37 51 42 89 52 32 46 00 57 02 71 97 ?? 16
59 69 31 20 16 37 66 34 99 76 07 23 40 85 64 91
84 42 37 15 58 54 17 16 45 73 67 20 09 27 90 96
57 46 65 19 78 34 57 12 ?? 45 54 65 ?? 17 30 90
78 17 51 47 69 22 41 48 01 99 66 4? 00 28 21 74
27 66 33 21 49 11 24 15 33 70 06 95 04 67 98 56
82 54 98 27 81 86 77 35 ?7 56 32 72 60 90 26 75
33 06 79 71 73 57 96 74 85 94 36 97 87 79 82 00
77 94 61 11 69 61 78 78 36 51 4? 21 82 94 39 22
87 15 49 66 56 55 34 99 05 26 45 35 59 83 55 47
24 98 52 45 79 85 15 67 32 21 29 94 98 90 02 27
05 66 15 23 83 66 24 98 06 75 60 69 64 ?? 58 24
84 90 70 ?? 01 36 90 78 56 40 61 00 53 40 75 37
49 50 30 ?? ?? 38 70 10 80 71 12 54 60 76 62 13
27 53 95 4? ?4 78 61 85 56 15 71 76 25 31 96 39
56 17 07 83 96 29 88 39 67 86 98 23 ?? 03 82 62
41 67 05 42 29 ?? 54 76 71 82 04 81 82 63 00 23



TABLE 3.2: SELECTED SAMPLE CASES

163  585  214  677  692  721  178  491
209  534  012  515  380  624  ???  068
690  417  510  047  457  405  ?91  241
147






When we examine the list of 50 numbers, we see that
the number 147 has been drawn twice. For a simple
random sample, the technically correct procedure in
such cases is to count the data collected from student
number 147 twice in the analysis. This procedure is
called sampling with replacement. In actual practice,
most researchers simply draw additional cases until
they reach the desired sample size, and count each case
once in the analysis. This procedure is called
sampling without replacement.
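
The table-reading procedure described above can be mimicked in a few lines of code. The following is a minimal Python sketch and not part of the original handbook: the function name `draw_from_digit_stream` is hypothetical, and a string of digits stands in for one column of a printed random number table.

```python
import random


def draw_from_digit_stream(digits, population_size, sample_size, width=3):
    """Read fixed-width numbers from a string of random digits.

    Numbers outside 1..population_size are skipped, and repeated numbers
    are skipped as well (sampling without replacement).
    """
    selected = []
    seen = set()
    for start in range(0, len(digits) - width + 1, width):
        number = int(digits[start:start + width])
        if number < 1 or number > population_size:
            continue  # not on the list (for example 827 when the list ends at 735)
        if number in seen:
            continue  # already drawn once, so draw an additional case instead
        seen.add(number)
        selected.append(number)
        if len(selected) == sample_size:
            break
    return selected


# A pseudo-random digit string used here in place of a printed table (assumption)
rng = random.Random(1984)
digit_stream = "".join(str(rng.randrange(10)) for _ in range(3000))
sample = draw_from_digit_stream(digit_stream, population_size=735, sample_size=50)
```

Called with the short digit string `"120147321827147"` and a population of 735, the function returns `[120, 147, 321]`: 827 is skipped because it is not on the list, and the second 147 is skipped because that student has already been drawn.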







Stratified Sampling

Definition: A stratified sample is one obtained by separating the
population into nonoverlapping groups called strata
and then selecting a simple random sample from each
stratum.

Example: For a review of NDSL loans made by a major lender, the
reviewer divided the population of loans into two
primary strata: loans to students currently in school
and loans to students who are no longer in school.
The second stratum was further subdivided into loans
in the grace period, loans that have been repaid,
delinquent loans, and loans in the process of
repayment. From each of the resulting five strata, a
simple random sample was drawn, and the selected cases
were reviewed. This procedure guaranteed the
reviewers an adequate number of loans within each
stratum to make objective statements about members of
the stratum and make projections to the total
population of loans.

Advantages: Stratified sampling produces unbiased population
estimates and it is more precise than simple random
sampling given a fixed sample size. Stratified
sampling introduces a great deal of flexibility into
statistical sampling. Members of groups of special
interest can be given a higher probability of being
sampled than members of groups of low interest. The
selector's prior knowledge can be incorporated in the
sample design.

Disadvantages: Stratified sampling requires advance knowledge of the
proportion of the population in each stratum;
otherwise, the precision of the sample is decreased.
If lists of cases in each stratum are not available,
stratified sampling may not be possible.
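
As an illustration of drawing a simple random sample within each stratum, the following Python sketch may help. It is an assumption-laden example rather than the handbook's procedure: the function name `stratified_sample`, the stratum names, the loan identifiers, and the per-stratum sample sizes are all hypothetical.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] cases from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, cases in strata.items():
        # rng.sample draws without replacement within the stratum
        sample[name] = rng.sample(cases, sizes[name])
    return sample


# Hypothetical loan ID lists for five strata like those in the example
strata = {
    "in_school": list(range(1, 401)),
    "grace_period": list(range(401, 501)),
    "repaid": list(range(501, 801)),
    "delinquent": list(range(801, 861)),
    "in_repayment": list(range(861, 1001)),
}
# Hypothetical sample sizes; the small "delinquent" stratum is oversampled
sizes = {"in_school": 40, "grace_period": 20, "repaid": 30,
         "delinquent": 30, "in_repayment": 30}
loans_to_review = stratified_sample(strata, sizes, seed=1)
```

Because the sizes are set per stratum, a group of special interest can be given a higher chance of selection than its share of the population would give it.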



Cluster Sampling

Definition: A cluster sample is a simple random sample in which
the sampling units are collections, or clusters, of
cases.

Example: For a review of the CWS program in a university, a
team of program reviewers sampled work-study time
sheets by selecting three weeks at random and then
reviewing all the time sheets for each of the three
weeks selected. Because the time sheets were selected
in groups rather than individually, this sampling
method is called cluster sampling. Cluster sampling
is commonly used when the population to be sampled is
dispersed over a wide geographical area. For example,
to reduce travel costs for a follow-up study of
student loan defaults, several cities and towns were
selected at random and then loan recipients sampled
within the selected areas.

Advantages: Cluster samples produce unbiased estimates of
population parameters. They require a listing of only
those cases included in the selected clusters. In
many circumstances cluster sampling is more
cost-effective than other methods. For personal
interview surveys conducted over a wide area, travel
costs can often be substantially reduced by
clustering. When complete population lists are not
available, clustering can reduce costs in sample
selection.

Disadvantages: Clustering generally produces less precise population
estimates given a fixed sample size than simple random
or stratified sampling. It is usually not possible to
determine in advance the minimum sample size required
to achieve a predetermined level of precision.
Computation of population estimates from cluster
samples can be quite complex.
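
The time sheet example can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `cluster_sample`, the number of weeks, and the time sheet identifiers are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters clusters at random and keep every case inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {key: list(clusters[key]) for key in chosen}


# Hypothetical data: 15 weeks, each holding 20 time sheet identifiers
time_sheets_by_week = {
    week: [f"wk{week:02d}-sheet{i}" for i in range(1, 21)]
    for week in range(1, 16)
}
to_review = cluster_sample(time_sheets_by_week, n_clusters=3, seed=7)
```

Only the three selected weeks need to be listed in full, which is the practical advantage noted above.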






Systematic (Interval) Sampling

Definition: Systematic sampling is a method of selection whereby
sample cases are drawn from a population at some fixed
interval but where the starting case is selected at
random.

Example: For an audit of BEOG awards an auditor determined that
a minimum sample of 100 awards was necessary. The
university being audited had a validation roster
containing 1,172 names. The audit reviewer divided 1,172 by
100 to obtain 11.72, which he rounded down to the
number 11. Selecting the random number of 3 as a
starting point, he selected for review the 3rd name on
the validation roster, the 14th name (3 + 11), the
25th name (14 + 11) and so on until he had worked
through the entire list.

Advantages: Often systematic samples are the easiest type to
construct. They do not require a complete listing of
all cases. In most circumstances, when the cases are
ordered at random, or in alphabetical order, the
resulting sample is unbiased. Certain methods of
ordering cases, such as date of loan or size of loan,
can introduce implicit stratification into the sample
and thereby increase precision.

Disadvantages: Systematic sampling is not usable when cases are
missing from the files or records. Periodic ordering
of files or records can introduce bias into systematic
sampling. For example, if a particular record system
added a new file for every work day, any systematic
sample of the files which had '5' as a factor of its
sampling interval would result in selection of cases
all from the same day of the week.
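
The arithmetic of the BEOG example can be checked with a short Python sketch. The function name `systematic_sample` is hypothetical; the roster size, minimum sample size, and random start are the figures given in the example.

```python
def systematic_sample(population_size, min_sample_size, start):
    """Return the list positions selected by interval sampling.

    The interval is the population size divided by the minimum sample size,
    truncated to an integer; start is a random number from 1 to the interval.
    """
    interval = population_size // min_sample_size
    if not 1 <= start <= interval:
        raise ValueError("start must lie between 1 and the sampling interval")
    return list(range(start, population_size + 1, interval))


positions = systematic_sample(population_size=1172, min_sample_size=100, start=3)
first_three = positions[:3]   # the 3rd, 14th and 25th names
```

Because the interval is truncated from 11.72 to 11, working through the entire roster selects 107 names, slightly more than the minimum of 100.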




Drawing a systematic sample from filing cabinets: If the program under
review can provide access to filing cabinets
containing the files of student aid recipients,
systematic sampling from the filing cabinets may be
appropriate.

Procedures for sampling from filing cabinets                 Line Number

1. Determine the total number of student aid
   recipients.                                      N = ______ (1)

2. Using Form C on page 41 determine the
   minimum necessary sample size.                   n = ______ (2)

3. Divide line 1 by line 2.                       N/n = ______ (3)

4. Truncate line 3 to an integer. (For example,
   if line 3 equals 17 plus any fraction, write
   '17' on line 4.)                 Sampling Interval = ______ (4)

5. Employing Table 3.1 on page 15 select a
   random number between 0 and 9.       Random Number = ______ (5)

6. Using Table 3.3 below select a starting number.

7. Start with the top drawer of the first filing cabinet. Count
   file folders until you get to the starting number on line 6.
   Select this case for review.

8. Starting with the last file selected, count forward the number
   of files specified by the sampling interval (line 4). Select the
   file obtained for review.

9. Repeat step 8 until you have worked your way through all the
   files.






TABLE 3.3: RANDOM START FOR INTERVAL SAMPLING

Sampling
Interval                  Number on Line 5
(Line 4)     0    1    2    3    4    5    6    7    8    9

   2         2    2    2    1    1    2    2    ?    1    ?
   3         2    1    2    3    ?    2    1    ?    ?    ?
   4         1    2    4    2    1    3    1    ?    1    1
   5         4    2    4    2    5    4    3    1    ?    ?
   6         6    5    3    1    5    ?    2    ?    ?    ?
   7         5    ?    7    5    4    4    2    ?    ?    ?
   8         5    6    8    ?    2    2    ?    ?    ?    7
   9         ?    ?    6    6    5    3    9    ?    ?    ?
 10 - 11     6    5    1    3    5    3    7    ?    ?    ?
 12 - 13     7    1    6    ?    ?   12    4    ?    ?    ?
 14 - 16     9    ?    ?    8    3   10    2    ?    ?    ?
 17 - 19    15   11   17   11    6   17    6    ?    ?    ?
 20 - 24     7   10   17   14    9    5    7    4    5    8
 25 - 29     1   23   25    5    5   11    4    ?   10    6
 30 - 35    30    9    8    6   17   19   21   29    ?   25
 36 - 41    32   36    ?   28   13    3   22   17    ?   33
 42 - 49    25   37   12   30   30   38   26   31    9    ?
 50 - 57    18   16    9   15   31    8   24   17   23    ?
 58 - 69     4   29    ?   14    3    6   35   31   25   38
 70 - 81     ?    ?   14   19   11   52   44   45   53   10
 82 - 99    78   28   17   76   74   60   37   34    8   13
 100+       79   84   18   43   72   ?4   15   33   59    ?

Starting Number = ______ (6)






Dollar-Unit Sampling

Definition: When the sampling unit is defined as an individual
dollar rather than an individual loan, grant, etc.,
the sample is called a dollar-unit sample.

Example: Sampling loans or accounts tends to give equal weight
to each loan or account. For many purposes, a one
hundred dollar loan should not be counted equally with
a ten thousand dollar loan. This is particularly true
when the goal of the auditor or reviewer is to
estimate dollar amounts of overpayments or
discrepancies. One solution to this problem is dollar-
unit sampling. Rather than treating loans, grants or
recipients as sampling cases, dollar-unit sampling
uses the dollars involved as the sampling unit. As an
illustration, consider an audit review of a university
which administered 483 CWS awards totalling $724,500.
The university supplied the auditor with a list of
award recipients and the dollar amount of each award.
Because the awards varied greatly in amount, the
auditor decided to conduct a systematic dollar-unit
sample. Having determined that a sample size of 142
was necessary for her purposes, she divided 724,500 by
142 to arrive at a sampling interval of $5,102. She
then selected a random starting point of 1,847 in the
first interval.

Working her way through the list of awards, she made a
running total of the amounts on the listing,
selecting for review the award containing the 1,847th
dollar, the 6,949th dollar (1,847 + 5,102), the 12,051st
dollar (6,949 + 5,102) and so on. Using this method of
sampling, each award had a probability of being
selected directly proportional to its size. Thus an
award of $1,000 had twice the chance of being selected
as an award of $500.
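
The running-total selection described in the example can be sketched in Python. This is a minimal illustration, not the handbook's own procedure: the function name `dollar_unit_sample` and the five award amounts are hypothetical, while the interval of 5,102 and the starting point of 1,847 come from the example.

```python
def dollar_unit_sample(amounts, interval, start):
    """Return the indexes of the items that contain each selected dollar."""
    selected = []
    running_total = 0
    target = start  # position of the next dollar to be selected
    for index, amount in enumerate(amounts):
        running_total += amount
        # An item larger than the interval can contain several selected
        # dollars; it is still listed only once.
        while target <= running_total:
            if not selected or selected[-1] != index:
                selected.append(index)
            target += interval
    return selected


# Hypothetical award amounts, in list order
awards = [1000, 500, 3000, 2500, 6000]
picked = dollar_unit_sample(awards, interval=5102, start=1847)
```

For these amounts the running totals are 1,000, 1,500, 4,500, 7,000 and 13,000, so the 1,847th, 6,949th and 12,051st dollars fall in the third, fourth and fifth awards, and `picked` is `[2, 3, 4]` (indexes count from zero).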

Advantages: Dollar-unit sampling produces unbiased population
estimates. It produces more precise estimates of
dollar amount population parameters than loan, grant,
or recipient sampling given a fixed sample size. It
is an effective way of locating large errors clustered
in large accounts that are almost impossible to detect
by account sampling. Finally, the problems of
converting error frequencies into dollar amounts for
population projections are avoided.

Disadvantages: Dollar-unit sampling produces less precise estimates
of error frequencies in a population than loan, grant,
or recipient sampling given a fixed sample size. In
many cases, the data on account size required to draw
a dollar-unit sample are not available in advance.



Sequential (Stop or Go) Sampling

Definition: In sequential sampling, on the basis of a minimum
initial sample, decisions are made as to whether it is
necessary to sample additional cases, and if so, what
type of cases should be sampled.

Example: During a program review, a team of reviewers drew a
minimum initial sample of 50 student files from a
population of 14,187 student aid recipients. From a
review of the initial sample they found no errors in
SEOG and NDSL awards and grants. However, they
discovered a substantial number of discrepancies in
CWS awards. On the basis of this information, they
decided to terminate their audit of SEOG and NDSL and
to draw an additional sample of 50 CWS awards.

Advantages: Sequential sampling allows minimization of sample
size, does not require advance knowledge of population
distributions, allows modification of the sample design to
take advantage of knowledge gained during the previous
sequence, and adds flexibility to discovery sampling
designs.

Disadvantages: Sequential sampling may be very time consuming because
it requires that the sampling process be periodically
halted to analyze data gathered.



Discovery (Exploratory) Sampling

Definition: Discovery sampling is a sampling design used to locate
examples or establish a maximum rate for infrequent
occurrences. Discovery sampling is a method of giving
assurance to an auditor or program reviewer that if
some critical event has occurred with some minimum
frequency, the sample will contain at least one
example of this event.

Example: Suppose an auditor wished to examine 20,000 grant
vouchers for possible cases of fraud. To assure that
there were no cases of fraud he/she would, of course,
be required to examine all 20,000 vouchers, which
might not be practical. One alternative to a 100
percent review would be to sample enough cases to
assure that if fraud did exist at above a certain
level or rate the auditor will have reasonable
certainty of discovering at least one case. If the
auditor drew a random sample of 300 vouchers, he could
be assured at a 95 percent confidence level that if
fraud occurred in 1 percent or more of the vouchers at
least one case of fraud would be included in the
sample. Therefore, if an examination of the sample
vouchers revealed no examples of fraud, the auditor
could reasonably conclude that even if fraud did
occur, it occurred in less than 1 percent of the
vouchers. Discovery sampling is often used with
sequential sampling. After review of the initial
sample a decision is made concerning the need for
examination of additional cases.

Advantages: Discovery sampling is an effective method of
establishing maximum rates for rare but significant
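
The figures in the voucher example (a sample of 300, a 1 percent rate, 95 percent confidence) can be checked with a short calculation. The Python sketch below is an illustration only: the function names are hypothetical, and it assumes each sampled voucher is independent, which is a reasonable approximation when the sample is small relative to the population of 20,000.

```python
import math


def discovery_confidence(sample_size, occurrence_rate):
    """Chance that a random sample contains at least one occurrence."""
    return 1 - (1 - occurrence_rate) ** sample_size


def discovery_sample_size(confidence, occurrence_rate):
    """Smallest sample giving at least the stated chance of one occurrence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - occurrence_rate))


confidence_at_300 = discovery_confidence(300, 0.01)
needed_for_95 = discovery_sample_size(0.95, 0.01)
```

Under this assumption a sample of 300 gives a little over 95 percent confidence of finding at least one case when the true rate is 1 percent, and 299 cases is the smallest sample that reaches 95 percent, which agrees with the example.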






Multi-Stage Sampling

Definition: Multi-stage sampling is a process of selecting a
sample in two or more successive and contingent stages.

Example: There is no complete listing of college students in
the United States and therefore a simple random sample
of college students is not possible. One way to draw
a representative sample would be to first sample
colleges, which do have complete student rosters, and
then sample students attending the selected colleges.

The precision of the sampling design could be improved
by first stratifying colleges by such variables as
size, type, and geographic location and then sampling
from each stratum. Within colleges, the student
population could also be stratified by year-in-school,
enrollment status, sex, race, and so on.

At times, samples can involve many stages. A recent
survey of elementary school children first sampled
school districts, then elementary schools within the
selected districts, then classes within the selected
schools, then students within the selected
classrooms. At every stage, the sample was stratified
and weighted to improve precision.

Advantages: Multi-stage sampling introduces a great deal of
flexibility into sample design and makes possible
sampling of populations for which there are no
complete lists of cases. It also can incorporate the
advantages of stratification and clustering in a
single sample.

Disadvantages: Multi-stage sampling introduces complexity
into data analysis. Estimation of confidence
intervals for a multi-stage sample usually requires
knowledge of advanced statistical procedures.
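
A two-stage version of the college example might be sketched as follows. This is a hedged illustration, not a method given in the handbook: the function name `two_stage_sample`, the college names, and the roster contents are hypothetical, and no stratification or weighting is applied.

```python
import random


def two_stage_sample(rosters, n_colleges, n_students, seed=None):
    """Stage 1: sample colleges. Stage 2: sample students within each."""
    rng = random.Random(seed)
    colleges = rng.sample(sorted(rosters), n_colleges)
    sample = {}
    for college in colleges:
        roster = rosters[college]
        # A college with fewer students than requested is taken in full
        sample[college] = rng.sample(roster, min(n_students, len(roster)))
    return sample


# Hypothetical rosters: 20 colleges with 100 students each
rosters = {
    f"college_{c}": [f"c{c}_student_{s}" for s in range(1, 101)]
    for c in range(1, 21)
}
students = two_stage_sample(rosters, n_colleges=4, n_students=10, seed=3)
```

Only the rosters of the four selected colleges are needed, which is what makes the design workable when no national list of students exists.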






Opportunity Sampling

Definition: Opportunity sampling is the selection of sample cases
in a haphazard way. The selector takes an opportunity
sample when he selects any case he happens to run
across for inclusion in the sample.

Example: Sampling the first twenty files in a cabinet of
student financial aid recipients for review is an
example of opportunity sampling. A common version of
opportunity sampling is "man-on-the-street" interviews
conducted by television, radio and newspapers as an
informal measure of public opinion on current events.

Advantages: Opportunity sampling is an easy sample selection
method because it imposes no constraints on which
cases may be selected.

Disadvantages: Opportunity sampling potentially introduces bias into
the sample because there is no way of being certain
that the sampled cases are truly reflective of the
total population sampled.

For example, selection of the first twenty student
files in a cabinet may result in a sample limited to
recent aid recipients. "Man-in-the-street" interviews
conducted during working hours may exclude working
people from the sample. Therefore, the results of
opportunity sampling cannot provide the basis for
objective projection of sample results to the total
population.



Quota Sampling

Definition: When a pre-specified number or quota of sample cases
is selected on an opportunity basis from the various
groups or categories which compose the population
under study, the sample is called a quota sample. A
quota sample is a stratified opportunity sample.

Example: If a researcher wished to sample the student
population of a university in which 30 percent were
freshmen; sophomores, juniors and seniors each
composed 20 percent of the students; and 10 percent
were graduate students, he might select the first 30
freshmen, 20 sophomores, 20 juniors, 20 seniors and 10
graduate students leaving the student union.

Advantages: Quota sampling increases the representativeness of
opportunity sampling and thereby potentially reduces
bias.

Disadvantages: Quota sampling rests on the false assumption that
membership in a category automatically qualifies
a case to represent all members of that
category. Quota sampling is subject to selection
bias and has unknown statistical properties.
Therefore, the results of quota sampling cannot
provide the basis for objective projection of
sample results to the total population.






Choosing the Right Sample Design

There are no simple rules for choosing the optimal sample design that
apply in all circumstances. However, several basic guidelines can be
used. Simple random or systematic sampling is most effective when
complete lists of the population exist and the researcher desires a
sampling design that lends itself to simple analysis. Stratified
sampling is advantageous when the researcher wishes to assure inclusion
in the sample of certain subpopulations or to increase the efficiency of
simple random sampling. Cluster sampling can save costs for personal
interview surveys conducted over a wide area, or when complete population
lists are not available. Dollar-unit sampling is advantageous when
auditing a system of records containing many small and a few large
accounts. Sequential and discovery sampling are most useful when
investigating a population about which there is little advance knowledge.

Ultimately, study goals and resources dictate choice of sample
design. The great variety of sample design choices permits tailoring of a
sample to many different study purposes and budgets.



Computing Sample Statistics

This chapter contains a series of forms to aid in calculating a
variety of common sample statistics. Form A is for calculating the
estimate of the population variance from a sample. Form B can be used in
developing population estimates from a simple random sample and Form C is
for determining minimum necessary sample sizes. Each of the forms
contains a step-by-step procedure for calculating these important sample
statistics.

In addition to these hand calculation forms, two forms have been
provided for computing these statistics with a calculator. Form D shows
the procedure for using a calculator to determine population variance,
and Form E, for developing population estimates. It should be noted that
while a calculator can be used for all the sample statistics described in
this manual, efficiencies can be gained in its use for calculating
population variance and population estimates.




Form A

Estimating the Variance of a Population from a Sample ($s^2$)

$$s^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{\sum X_i^2 - (\sum X_i)^2 / n}{n - 1}$$

STEP                                                        LINE NUMBER

A. How many cases are in the sample?
       $n$ = ______ (1)

B. Subtract 1 from line 1
       $n - 1$ = ______ (2)

C. Is the variable a categorical variable (such as sex or
   recipient/nonrecipient) or a continuous variable (such as income
   or SEOG)?
       [ ] Categorical: Go to Step F
       [ ] Continuous: Continue with Step D

D. Calculate the numerator; the "sum of squared deviations"

   D1 Square the value of each sample case and sum the results
       $X_1^2 + X_2^2 + X_3^2 + \cdots + X_n^2 = \sum X_i^2$ = ______ (3)

   D2 Add the values from all the cases in the sample together
       $X_1 + X_2 + X_3 + \cdots + X_n = \sum X_i$ = ______ (4)

   D3 Square line 4
       $(\sum X_i)^2 = (\sum X_i) \cdot (\sum X_i)$ = ______ (5)

   D4 Divide line 5 by line 1
       $(\sum X_i)^2 / n$ = ______ (6)

   D5 Subtract line 6 from line 3
       $\sum X_i^2 - (\sum X_i)^2 / n$ = ______ (7)

E. Calculate the estimate of the population variance; divide line 7
   by line 2
       $s^2$ = ______ (8)

F. How many cases in the sample are in the category of interest?
       Number of cases in category $f$ = ______ (9)
   (For example, if you are interested in estimating the variance of
   sex, how many females are there in the sample? For dichotomous
   variables it makes no difference which category is chosen.)

G. Square line 9
       $f^2 = f \cdot f$ = ______ (10)

H. Divide line 10 by line 1
       $f^2 / n$ = ______ (11)

I. Subtract line 11 from line 9
       $f - f^2 / n$ = ______ (12)

J. Divide line 12 by line 2
       $s^2$ = ______ (13)
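
The arithmetic of Form A can be expressed compactly in code. The following Python sketch follows the form's computational formula step by step; the function names `sample_variance` and `categorical_variance` and the example data are hypothetical.

```python
def sample_variance(values):
    """Form A, steps A through E: variance estimate for a continuous variable."""
    n = len(values)                          # line 1
    sum_of_squares = sum(x * x for x in values)   # line 3
    total = sum(values)                      # line 4
    numerator = sum_of_squares - total ** 2 / n   # line 7
    return numerator / (n - 1)               # line 8


def categorical_variance(n, f):
    """Form A, steps F through J: n cases, f of them in the category."""
    return (f - f * f / n) / (n - 1)         # line 13


# Hypothetical sample of eight values
s2 = sample_variance([2, 4, 4, 4, 5, 5, 7, 9])
```

For a categorical variable coded 1 for cases in the category and 0 otherwise, both functions give the same answer, which is why the form can use the shorter steps F through J.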



Form B

Developing Population Estimates
From a Simple Random Sample

STEP                                                        LINE NUMBER

A. Basic Sample Information

   A1 How many cases are in the sample?
       $n$ = ______ (1)

   A2 How many cases are in the total population from which the
      sample was drawn?
       $N$ = ______ (2)

B. Calculate the sample mean ($\bar{X}$)

   B1 Add the values for all the cases in the sample together
       $\sum X_i$ = ______ (3)

   B2 Divide line 3 by line 1
       $\sum X_i / n = \bar{X}$ = ______ (4)

C. Calculate the estimate of the sampling mean standard deviation

   C1 Using Form A, calculate the estimated sample variance
       $s^2$ = ______ (5)

   C2 Divide line 5 by line 1
       $s^2 / n$ = ______ (6)

   C3 Subtract line 1 from line 2
       $N - n$ = ______ (7)

   C4 Divide line 7 by line 2
       $(N - n) / N$ = ______ (8)

   C5 Multiply line 6 by line 8
       $\frac{s^2}{n} \cdot \frac{N - n}{N}$ = ______ (9)

   C6 Take the square root of line 9
       $\sigma_{\bar{X}}$ = ______ (10)

D. Set the "confidence level" (confidence level is defined on page 13)
       Confidence level = CL = ______ (11)

E. Determine the "Z" or "K" value from the table below

       If CL =     If n ≥ 30     If n < 30
                   then Z =      then K =
        80%          1.28          2.24
        90%          1.64          3.16
        95%          1.96          4.44
        99%          2.58         10.00

       Z or K = ______ (12)

F. Calculate the estimate of the population total
   Multiply line 2 by line 4
       $T = N \cdot \bar{X}$ = ______ (13)

G. Calculate the confidence interval of the estimated population total
   Multiply line 2 by line 12 by line 10
       If n ≥ 30:  $CI = N \cdot Z \cdot \sigma_{\bar{X}}$ = ______ (14)
       OR
       If n < 30:  $CI = N \cdot K \cdot \sigma_{\bar{X}}$ = ______ (14)

H. Calculate the upper bound of the confidence interval
   Add line 14 and 13
       $T + CI$ = ______ (15)

I. Calculate the lower bound of the confidence interval
   Subtract line 14 from line 13
       $T - CI$ = ______ (16)

J. Interpreting the results
   Fill in the blanks in the sentence below.

   "On the basis of a sample of ______ cases, it can be estimated with
   ______ percent confidence, the total value of ______ for the
   population sampled falls between ______ and ______ with the most
   likely value ______."
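
The steps of Form B can be strung together in a short Python sketch. This is an illustration under stated assumptions rather than an official implementation: the function name `estimate_population_total` and the example data are hypothetical, and the Z and K multipliers are the ones printed in the form's table.

```python
import math

# Multipliers from the table in step E of Form B
Z_VALUES = {80: 1.28, 90: 1.64, 95: 1.96, 99: 2.58}   # used when n >= 30
K_VALUES = {80: 2.24, 90: 3.16, 95: 4.44, 99: 10.00}  # used when n < 30


def estimate_population_total(values, population_size, confidence_level=95):
    """Return (total, lower bound, upper bound) following Form B."""
    n = len(values)
    mean = sum(values) / n                                   # line 4
    variance = (sum(x * x for x in values)
                - sum(values) ** 2 / n) / (n - 1)            # line 5 (Form A)
    fpc = (population_size - n) / population_size            # line 8
    standard_error = math.sqrt(variance / n * fpc)           # line 10
    table = Z_VALUES if n >= 30 else K_VALUES
    multiplier = table[confidence_level]                     # line 12
    total = population_size * mean                           # line 13
    half_width = population_size * multiplier * standard_error   # line 14
    return total, total - half_width, total + half_width


# Hypothetical sample of 8 cases from a population of 100
total, lower, upper = estimate_population_total([2, 4, 4, 4, 5, 5, 7, 9], 100, 95)
```

With only 8 cases the sketch uses the more conservative K column, so the interval around the estimated total of 500 is much wider than it would be with the Z value.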



Form C

Determining Sample Size

STEP                                                        LINE NUMBER

A. Is the variable to be estimated a categorical variable (such as
   sex or percentage of errors) or a continuous variable (such as
   dollars expended or average cost)?
       [ ] Categorical variable - Go to Step H
       [ ] Continuous variable - Go to Step B

B. Establish the acceptable error.
       $E$ = ______ (1)
   (If, for example, you wish to estimate the average weight of
   students in a class of thirty within two pounds, write "2" on
   line 1. However, if you wish to estimate the total weight of
   students in the class within a given number of pounds, you must
   first calculate the average acceptable error ($E$) by dividing the
   total acceptable error ($TE$) by the number of cases in the
   population to be sampled. In this case the average error would be
   $E = TE / N$. Write the result on line 1.)

C. Set the "confidence level"
       Confidence level = CL = ______ (2)
   (See page 13 for a definition of confidence level)

D. Determine "Z" value from the table below

       If CL =     then Z =
        80%          1.28
        90%          1.64
        95%          1.96
        99%          2.58

       $Z$ = ______ (3)

E. How many cases are there in the total population from which the
   sample is to be drawn?
       $N$ = ______ (4)

F. Determine the estimated population variance ($\sigma^2$) (see page
   14 for a definition of variance)

   $\sigma^2$ can be estimated from:
       1. Past experience
       2. Pilot study
       3. If sampling from an approximately normally distributed
          population, the variance can roughly be estimated as
          $\sigma^2 = R^2 / 25$, where $R$ is the range. The range is
          the highest value minus the lowest value.

       $\sigma^2$ = ______ (5)

G. Calculate the minimum necessary sample size.

   G1 Square line 3 (multiply line 3 by itself)
       $Z^2$ = ______ (6)

   G2 Multiply line 5 by line 6
       $\sigma^2 Z^2$ = ______ (7)

   G3 Multiply line 7 by line 4
       $N \sigma^2 Z^2$ = ______ (8)

   G4 Square line 1
       $E^2$ = ______ (9)

   G5 Multiply line 9 by line 4
       $E^2 N$ = ______ (10)

   G6 Add lines 10 and 7
       $E^2 N + \sigma^2 Z^2$ = ______ (11)

   G7 Divide line 8 by line 11
       $n = \frac{N \sigma^2 Z^2}{E^2 N + \sigma^2 Z^2}$ = ______ (12)

   If line 12 is less than 30, sample a minimum of 30.

H. If the variable to be estimated is a categorical variable:
   Establish the acceptable error.
       $E$ = ______ (13)
   (For example, if you wish to estimate the proportion of students
   in a class of thirty who are female within .05 (5 percent) error,
   write .05 on line 13. However, if you wish to estimate the total
   number of students who are female within three students, you must
   first calculate the proportion acceptable error, $E$, by dividing
   the total acceptable error by the number of cases in the
   population to be sampled. In this case the proportion error would
   be $E = TE / N = 3 / 30 = .10$. Write the result on line 13.)

I. Set the "confidence level"
       Confidence level = CL = ______ (14)
   (See page 13 for a definition of confidence level)

J. Determine the "Z" value from the table below

       If CL =     then Z =
        80%          1.28
        90%          1.64
        95%          1.96
        99%          2.58

       $Z$ = ______ (15)

K. Determine the estimated population percentage for the category to
   be estimated ($P$)
       $P$ = ______ (16)

   $P$ can be estimated from:
       1. Past experience
       2. A pilot study
       3. Assuming the "maximum variance" and setting $P = .5$

L. How many cases are there in the total population from which the
   sample is to be drawn?
       $N$ = ______ (17)

M. Calculate the minimum necessary sample size.

   M1 Square line 15 (multiply line 15 by itself)
       $Z^2$ = ______ (18)

   M2 Subtract line 16 from 1
       $1 - P$ = ______ (19)

   M3 Multiply line 16 by line 19
       $P(1 - P)$ = ______ (20)

   M4 Multiply line 18 by line 20
       $Z^2 \cdot P(1 - P)$ = ______ (21)

   M5 Multiply line 21 by line 17
       $N Z^2 \cdot P(1 - P)$ = ______ (22)

   M6 Square line 13
       $E^2 = E \cdot E$ = ______ (23)

   M7 Multiply line 23 by line 17
       $E^2 N$ = ______ (24)

   M8 Add lines 21 and 24
       $E^2 N + Z^2 \cdot P(1 - P)$ = ______ (25)

   M9 Divide line 22 by line 25
       $n = \frac{N Z^2 \cdot P(1 - P)}{E^2 N + Z^2 \cdot P(1 - P)}$ = ______ (26)

   If line 26 is less than 30, sample a minimum of 30 cases.
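
Both branches of Form C reduce to the same formula, with the variance replaced by P(1 - P) for a categorical variable. The Python sketch below illustrates this; the function names and the example figures are hypothetical, and rounding the result up to a whole case is an assumption, since the form does not say how to round.

```python
import math


def sample_size_continuous(population_size, variance, error, z):
    """Form C, step G: minimum sample size for a continuous variable."""
    numerator = population_size * variance * z ** 2                # line 8
    denominator = error ** 2 * population_size + variance * z ** 2  # line 11
    n = math.ceil(numerator / denominator)   # rounding up is an assumption
    return max(30, n)                        # the form's floor of 30 cases


def sample_size_proportion(population_size, p, error, z):
    """Form C, step M: the same formula with variance P(1 - P)."""
    return sample_size_continuous(population_size, p * (1 - p), error, z)


# Hypothetical case: population of 1,172, maximum variance (P = .5),
# acceptable error of .05, 95 percent confidence (Z = 1.96)
n_needed = sample_size_proportion(1172, 0.5, 0.05, 1.96)
```

For these hypothetical inputs the formula gives about 289.3, so 290 cases would be drawn. Setting P to .5 is the most cautious choice because it makes P(1 - P) as large as it can be.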



USING A CALCULATOR TO COMPUTE STANDARD SAMPLING STATISTICS

Forms D and E present a method of using
a hand calculator to compute population
estimates from a sample. To use the
forms requires a hand calculator with
memory, square, and square-root keys
and which employs "standard algebraic
hierarchy," i.e., squares and
square-roots are performed as soon as
the appropriate keys are pressed, and
multiplication and division are
performed before addition and
subtraction. To test if your
calculator conforms to standard
algebraic hierarchy, press the
following keys:

If the display shows '13' the following forms are usable with your
calculator.



Form D

Using a Calculator to Compute
the Estimate of the Population
Variance from a Sample

Basic Information

How many cases are there in the sample?
       $n$ = ______ (1)

Computation

Enter                 Display        Comments

                      0              Clear memory and display

$X_1$                                Sum values of $X_i^2$ in display and
                                     sum values of $X_i$ in memory

$X_2$, ...                           Continue for all cases

                      $\sum X_i^2$   Sum of the squared value of all
                                     cases. Write contents of display
                                     on line 2

                      $\sum X_i$     Sum of the values of all cases.
                                     Write contents of display on
                                     line 3

$\sum X_i$ (from line 3),
$n$ (from line 1)     $s^2$          The estimated population variance




FoRivi E 



Using a Oiicuiator* to Compute Population 
Estimates from a simple Random Sample^ 



C Basic Information 

How many eataa tra thtra tn tha tampta? ^ 

How many cam arc thara In 
*tfia population from whfeif tha sampfa 
waa drawn? 



of tJ^/^ 



What {a tha aatimatad varianca 
population aampfad? 
iUaa form D to calculate the estin^ted 
popuiadon varisnca) 

yv hat la tha au(n of tha vaiuaa for atl 
In tSIa aa^tpla (Une ^ from^fomi 0^ 



8-* 



2X, 



Sal tha ''confidanca lavar 
(conftefenca faval ta defied on page 13) 



Conffdenca tav^* CV 



Determine the "Z" or "K" value from the
table below

              If n ≥ 30      If n < 30
If CL =       then Z =       then K =

80%           1.28           2.24
90%           1.64           3.16
95%           1.96           4.44
99%           2.55           10.00



KorZ 



imputation 



Enter 



(from 



Comments 



0 Clear Mem^ and O^iay 



.(1) 



.13) 
J4> 



.(6) 



N 



n 
(from 
iine t) 



13S@0 



(from ^ 
fine?) 

(from / ^ * 
linet) 

2 or K 0 WP 
(from ^ ^ 
flneS) 



MC 



XX, 
(from 
0118 4) 



n 

(from 
({net) 

N 
(from 
the 2) 



0 



N-n F!nfte''po|^ul8tion correction 
N 



IX, 



3? 



Sampib^ mean standard 
dev^dn 



Mean sampie vaKie 



' Estirr^ of popuiatbn totai . 
EnttN- (intents of dispfpy bn ihe 



.(7) 



T + N * 2 • 0^ Upper bound of confidence 

Enter amtants of display on 8 



.(8) 



bound of confidence Intend! 
Enttt' contents of dispiay on !(ne 9 



[nterpretlng the Resuhs 
Pf& iri the blanks In th0 sent^c^ bdow. 



CNri the basis of a sample of 



cases. It can e^mat^ 




percent coq^ence, the total value of 
sanqsled faHs betvi^n 



«)»$ 

for the population 



and 



Saws 




with the most fik^ value 



Bm7 






APPLICATIONS OF SAMPLING
TO STUDENT FINANCIAL AID

This chapter contains three examples of the potential uses of
sampling statistics by the Office of Student Financial Assistance.
Although the details of the examples are fictionalized, they are all
based on a combination of actual cases. The examples are designed both
to illustrate the application of statistical sampling in OSFA and to
address a range of potential problems that could arise in those
applications.



EXAMPLE 1: REVIEW OF SEOG AWARDS AT UNIVERSITY A

A financial aid program review at University A revealed five
overawards in the twenty-five SEOG awards reviewed. As a result, the
Department required that the University either conduct a complete audit
of its SEOG awards or perform its own statistically sound and
representative sample and project the results of the sample to the total
SEOG population during the period of the audit. The University selected
the latter option and proposed a 10% simple random sample of the 1460
SEOG awards made during the period of the audit. In evaluating the
proposed sample the Department determined that the sample would have to
be sufficient to estimate the number of overawards within ±50 and the
amount of overawards within ±$50,000, at a 95 percent confidence level. To
determine whether the University's proposed sample plan met these
criteria, Form C from this manual was used. A copy of the completed form
with relevant comments is attached.



FORM C

Determining Sample Sizes (n)



STEP 

A. Is viri«bte to b« Mtiffl«t«d>« 
oitcgorical vari«i»to (fiueh asiwi or 
peccantsQfl of onotsl or • continuous 
wirisbia (fuch M total (toOani vj^MndM or 
•vtrag* eoidi? 

\J( Citaooric»f vvmbtt - Go to'stap H 
□ Comii^toui variabit > Go to Stap a 
EatablMt tha «v«ni0« ace^t»btt trror. 



{ff, far axiini)!*, yoo wW» to aat^nitt tht 
§¥mg& waigtnof atudarrtKnacteof thMy 
wttt^ two pouinit wrtw'f ' on ft* t. 
Howavar, ffvou-wfan to a^inteti tba mtif 
; waigfn of jtudanti h tha ^ ^vltMn tyMtvt 
pouiKlf, you CHMat f&st ea(»rfita tha avwkga . 
•eo^jtaUa arror (e W <MIIn9 tha totit 
•ceaptiWa anw (TE) by tha flumbac of caaai 
in th# {x^titioit to ba«iitH>iat^ tnthii caaa 

Writa tha raiutt on Ibw 1} 



C. Sat tha/'oonfiilano* favaJ" 



UME 
MUMBER 




J1) 




iSM pig* IMer i cNrfinltion 
of confidMCi I«vt0 



In this case, the sample
will be used to estimate
both the dollar amount of
overpayments (a continuous
variable) and the number of
overpayments (a categorical
variable).

For such multiple use
samples, the necessary
sample size should be
determined independently
for each use and the
largest resulting estimate
of minimum necessary sample
size is used.

In this example we will
first estimate the
necessary sample size
needed to determine the
number of SEOG overawards.






G1 S«t*^Bne3 
■ G2 Multipiy line 5 by tine 6 

CS4 &}yaf« bit t 

» (36 Add GnM 10 and 7 

* - 



NkiftiplySne3byits«ff. Z^- 



H. if ^ vaH»bi« to b* •ftifTwtad is • 
^tagarieaf wiabtK 

Eatabfiah tha propitnion aecaptabta aitor 



'(For ttoHfi^, if you to 8C|^ratath«, 
/vci^irtbn of g^Ktentt in a da» of ti^ wtw 
ara fama(a with .(£ or fata wror wriia'.OS on 
Ana 13. Howav^, y«i vwah to a^mata iha 
roa^ /M/m6«r of ntKfaKtts wtK> ara farnrie 



.(61 



.m 



.m 




The level of acceptable
error was decided to be
50 overawards. To convert
this figure into proportion
acceptable error, the
following formula was used:
E = TE ÷ N = 50 ÷ 1460 = .034



/ 



54 



0 



%181/llWV XS38 



4 



7 



*■ 

tht proponion acnpote 4iriw 
th* toni Motpnbtt bylths mtmbw of cam In . 

N -30 
. Wrte th« twiit on fin* 13i 
K $«t th« "(^nfidmi* l«v^ 

ConfidMica latfil«'Cl« . 



J, 06ismiiti& thft ''r' va^ 
t»bi«b#fow ^ 



tfa- 
9d% 



2J8 - 



Vitiated, {p^ 

2L Apitorstiidy 
mtSwm&tgjPm 3 



L Hg»W fiwty 0M«f «r» xHmn in th# totirt 
population from whteh tht Mmplf It to 



m 




The initial review found 5
overawards in 25 files







M. CMeutttf tii« mMmum fwcMsary nmplf 










• 


♦ 




M2 S*j4mictDn» iefromt *\ 


1-P» 




M3 Muttipiv 10 oy im I9 


pn-f?y». 


- (201 








Mo IVUillipiy Zl Mlli 17 






' f 






V 


e'N^^-pti-P)-" 




r m Oh^f Ibn « bv flnt 28 . 


f«E8-Pft-PJ j 










^ ^ * tf tti«^ li{Mltt\in30^iinr^i 


1 . ■ 






I- 






Therefore, the minimum sample size necessary to estimate the total
number of overawards ±50 is 386. To determine the sample necessary to
estimate the total amount of overawards ±$50,000 the equation for
minimum sample size in Appendix B was used directly.
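
The sample size of 386 can be checked with a short Python sketch of the Form C calculation for a categorical variable, n = N·Z²·P(1 - P) / (E²·N + Z²·P(1 - P)). The function and variable names are assumptions made for illustration; the inputs (N = 1460, 5 overawards in 25 files, a tolerance of 50 overawards, Z = 1.96) come from the example.

```python
def min_sample_size_proportion(population_size, acceptable_error, z, p):
    """Minimum sample size for estimating a proportion from a simple
    random sample drawn without replacement from a finite population."""
    spread = z ** 2 * p * (1.0 - p)                        # Z^2 * P(1 - P)
    return (population_size * spread) / (
        acceptable_error ** 2 * population_size + spread
    )


N = 1460                        # SEOG awards during the audit period
total_acceptable_error = 50     # overawards
E = total_acceptable_error / N  # proportion acceptable error (about .034)
P = 5 / 25                      # overaward rate found in the initial review
Z = 1.96                        # 95 percent confidence level

n_overawards = min_sample_size_proportion(N, E, Z, P)
print(round(n_overawards))
```

Note that the unrounded value of E is carried through the calculation. Rounding E to .034 before squaring would give a somewhat larger figure.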

In our example:
N = 1460, the number of SEOG awards during the period of the audit
E = $50,000 / 1460 = $34.25, the average acceptable level of error
Z = 1.96, the Z value from Table A.6 associated with a 95 percent
confidence level.



Data from the program review was used to estimate the
population variance. The program review found overawards in the amounts
of $1,000, $788, $449, $300, and $1,625, as well as 20 cases with no
overawards. From Appendix B, it was found that the sample based
estimate of the population variance is:

\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]

Where:

\[ \bar{x} = \frac{1000 + 788 + 449 + 300 + 1625}{25} = 166.48 \]

\[ \sum (x_i - \bar{x})^2 = (1000 - 166.48)^2 + (788 - 166.48)^2 + (449 - 166.48)^2 + (300 - 166.48)^2 + (1625 - 166.48)^2 + 20 \cdot (0 - 166.48)^2 = 3090126.6 \]

\[ s^2 = \frac{3090126.6}{n - 1} = 128755 \]

Substituting these values into the formula for sample size, we obtain

\[ n = \frac{N \cdot Z^2 \cdot s^2}{E^2 \cdot N + Z^2 \cdot s^2} = \frac{(1460)(1.96)^2(128755)}{(34.25)^2(1460) + (1.96)^2(128755)} = 327 \]

Therefore, the minimum sample size necessary to estimate the total
amount of overawards ±$50,000 is 327. Because the sample size necessary
to estimate the number of overawards was larger (386), the University was
required to sample a minimum of 386 cases, instead of the proposed sample
of 146 cases.
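
The substitution for the dollar amount can be traced in Python. This is only a sketch: the function name is an assumption, and the variance of 128755 is taken as reported in the example rather than recomputed from the five overawards.

```python
def min_sample_size_continuous(population_size, acceptable_error, z, variance):
    """Minimum sample size for estimating a mean or total of a continuous
    variable from a simple random sample of a finite population."""
    spread = z ** 2 * variance                             # Z^2 * s^2
    return (population_size * spread) / (
        acceptable_error ** 2 * population_size + spread
    )


N = 1460                 # SEOG awards during the audit period
E = 50000 / N            # average acceptable error per award (about $34.25)
Z = 1.96                 # 95 percent confidence level
s_squared = 128755       # variance estimate as reported in the example

n_amount = min_sample_size_continuous(N, E, Z, s_squared)
n_count = 386            # size needed for the number of overawards

# For a multiple use sample, the larger of the two requirements governs.
required = max(round(n_amount), n_count)
print(round(n_amount), required)
```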



University A conducted the required sample based audit and reported
the results in Table 5.1 to the Department.

TABLE 5.1: SEOG AUDIT AT UNIVERSITY A

                        Number      Amount

No error found            346      $199,220
Overawards                 23        14,623
Missing affidavits         17        10,217

Total                     386       224,060



\[ \text{Error Rate} = \frac{\text{Number of Overawards (23)} + \text{Number Missing Affidavits (17)}}{\text{Sample Size (386)}} = 10.36\% \]

Net SEOG Awards During Period of Audit: $849,481

Estimated University Liability = $849,481 × 10.36% = $88,006
                                 (Net Awards) (Error Rate)

The Department rejected the estimate of total liability because the
University employed a faulty computation method. In estimating liability
the University had calculated the error rate on the basis of number of
errors rather than dollar amount of errors. The correct error rate is:

\[ \frac{\text{Amount of Overawards} + \text{Amount of Missing Affidavits}}{\text{Sample Total}} = \frac{14{,}623 + 10{,}217}{224{,}060} = 11.09\% \]

Therefore, estimated University liability is:
$849,481 × 11.09% = $94,176.
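
The difference between the two error rates is easy to see when both are computed side by side from the Table 5.1 figures. The sketch below is illustrative; the variable names are assumptions, and the rates are carried unrounded, so the count-based liability differs by a few dollars from a figure obtained by first rounding the rate to 10.36%.

```python
# (number of cases, dollar amount) for each audit finding in Table 5.1
findings = {
    "No error found": (346, 199220),
    "Overawards": (23, 14623),
    "Missing affidavits": (17, 10217),
}
net_awards = 849481  # net SEOG awards during the period of the audit

total_cases = sum(count for count, _ in findings.values())
total_dollars = sum(amount for _, amount in findings.values())

error_cases = findings["Overawards"][0] + findings["Missing affidavits"][0]
error_dollars = findings["Overawards"][1] + findings["Missing affidavits"][1]

# Rate used by the University: share of cases in error.
count_rate = error_cases / total_cases
# Rate required by the Department: share of sampled dollars in error.
dollar_rate = error_dollars / total_dollars

liability_by_count = net_awards * count_rate
liability_by_dollars = net_awards * dollar_rate

print(f"count-based rate  {count_rate:.2%}")
print(f"dollar-based rate {dollar_rate:.2%}")
print(f"dollar-based liability ${liability_by_dollars:,.0f}")
```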






EXAMPLE 2: BEOG APPLICANT QUALITY CONTROL STUDY

Statistical sampling was employed in a quality control study of BEOG
applicants because it offered a wide variety of advantages in the
analysis of the universe file of applicants containing over four million
records:

•   Sampling involved substantial cost savings. In advance, it was
    estimated that the study would require a minimum of thirty
    computer reads of the application data. A single computer read
    of the entire file cost approximately $2,700. Therefore,
    analysis of the entire universe file would cost a minimum of
    $81,000. A sample of 20,000 applications would cost $2,700 to
    construct. However, after the sample had been drawn, each
    additional computer read cost only $27, for a total study
    computer cost of $3,510, a savings of $77,490.

•   Sampling introduced only very minor error. For example, in
    estimating the percent of applicants attending public, 4-year
    institutions, the standard error of estimate was less than
    three-tenths of one percent.

•   Sampling speeded the completion of the study. A complete read
    of the Application File usually requires five hours of computer
    time and has an average turn-around time of five days using the
    Department's COMNET facilities. The data file containing a
    sample of BEOG applications usually required only a few minutes
    of computer time for each run and had a turn-around time of a
    few hours.

•   Sampling allowed use of a wide range of statistical packages
    such as SPSS, SAS, ^IftlS and SMITi, which are not practical for a
    file the size of the BEOG Applicant File.

•   Sampling made possible a wider range of analyses than would have
    been possible if the entire file had been utilized. Given the
    high costs, long time lags, and limited statistical software
    available, a population-based study could not realistically have
    explored as many topics as a sample-based study.
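
The cost comparison in the first point above reduces to a few lines of arithmetic. In the sketch below the constant names are illustrative, and the per-read cost on the sample file is taken as $27, the value implied by the stated $3,510 total for constructing the sample plus thirty reads.

```python
READS_REQUIRED = 30        # minimum computer reads estimated for the study
FULL_FILE_READ_COST = 2700 # dollars per read of the entire applicant file
SAMPLE_BUILD_COST = 2700   # dollars to construct the 20,000-record sample
SAMPLE_READ_COST = 27      # dollars per read of the sample file (assumed
                           # from the $3,510 total stated in the text)

full_file_cost = READS_REQUIRED * FULL_FILE_READ_COST
sample_cost = SAMPLE_BUILD_COST + READS_REQUIRED * SAMPLE_READ_COST
savings = full_file_cost - sample_cost

print(full_file_cost, sample_cost, savings)
```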




EXAMPLE 3: REVIEW OF CWS AUDIT REPORT FROM UNIVERSITY B

The Department received an audit report of CWS awards at University
B. Table 5.2 summarizes the audit's findings.

TABLE 5.2: AUDIT REPORT OF CWS AWARDS AT UNIVERSITY B

Total amount of CWS awards during the period of the audit: $833,118



Number of CWS 
Simple Random 



Awards: 
Sampje Size: 



Number of Ove|awards Identified: 




Error 

Students engaging In profit- 
making activity for tfte Institution 

Students not maintaining satisfactory 
progress In their course of study 



Students In default on a NDSL loan 
» - ^ total 





Amount of 


Number 


Overawards 


2 


$1,183 






3 


300 




685 




912 


J, 


1,326 


6 


$5,148 



The Department was concerned with whether the sample was an
adequate basis for projecting total University CWS overawards. To this
end, Form B in this manual was used. The completed form with relevant
comments is attached.



FORM B

Developing Population Estimates
From a Simple Random Sample



UNE 



A Bnie Sampte'iftfonnation ■ - 
A1 HcMr nwiy csaw are in trw ««T^7 



A2 Hmrtmif cummin thi 
toatf popuiitkm from 



B1 Add th* vahNW for afl tht cttw tfi the 
82 0Mdilina3 bvtin0l4 



2ZL 




C. CafettlRa^ fatittiatii of m sampOnt 
maan Mamlatd davlation 



CI Uiing form A, cHeuJots ttn aatimatad 



C2 0Md»iw$Wfi>it1 

4_ C3 SuteraA Bna 1 from fina 2 

C4 pMda&w7by&wa 

C6 M(dtipiy gna6byfine8 



{n this exai^la na will fortgo usa 
^ of Fom A m catcttlata tha 
¥«rl«flca <Jf<^ictly: 

2 Sxi - t)^ ^ 

(>42 - 93.8)^ , ♦ 
(3dG - 93,fr ♦ 



H-f\m 



N 

n N 



• (4617445.2) 4 (54) 

• 35508.24 






C6 T9kettwsquafeit>otof ftieS 



The Department used the most
commonly accepted confidence level
of 95% to estimate total liability.



0. S«t the ''cofnfUton^ i«v^'Mam^^^ 
o defied on pa^ ) 



E OotermiiMi nm '^or "K** «riu« from tht 
tabic Mew 

■ . / «» 

ffCL- 

80% . 1^ 

90% 1.64 

95?^ 1.96 

\ 89% . - 155 



F. Caiouicts th« Mtitnaxt of tha* 
PQPitlationtotaf 

G. CateuiatB ^ eonfidenca in^uvai of tha 
i^malad p(j|H><8tion4||t8f 

*i» ■ ■ , 

MultifNy One 2 by Bra 12 by Bne 10 
ff n^30 " 

ffn<30 

H- Calculate %b upptf bound of tfw 
Mn^fiahca intofvat 



|fn<30 
then X« 

2^4 
3.16 
4:44 
10.00 




KorZ- 



.(12J 



(13) 

- . ■■■■ V ... 



OR 



{14) 



T+CI 



Add line H and 13 



(15) 






1. CaJetilct* the loww botin^ of pm 

" Sutatiiet &nt 14 fiom fin* 13 

J. tnt^nprstlng th« rtaulti 

Fl in tht {jtentv in til* Mtiwte* biiOM', 



thi tjMii of • iwipto «rf OiWt it w b# 



ffloft BMiy Milut 

mis ■ 



The resulting confidence interval is obviously very wide. The low limit
of the interval, Sl7,f9f, is less than one-ninth of the high limit of
$163,771. Therefore, the Department determined that the audit sample was
an insufficient basis for projecting total University liability.



SUMMARY

The three examples contained in this chapter represent only a small
fraction of potential statistical sampling applications in OSFA.
Nonetheless, taken together, the examples demonstrate that statistical
sampling can be very straightforward and need not involve overly complex
calculations. The many advantages of statistical sampling can be
realized in a great diversity of situations through familiarity with the
basic logic of sampling and a few simple formulas.



APPENDIX A

INTRODUCTION TO SAMPLING STATISTICS

The statistics of sampling are presented in several different and
largely independent ways. This appendix is a short introduction to, and
explanation of, basic sampling statistics. Chapter 4, Computing Sample
Statistics, contains a series of forms to aid in calculating a variety of
common sample statistics. Appendix B summarizes basic sampling formulas
and symbols. Finally, for those who would like a fuller explanation of
sampling statistics or more advanced or specialized statistical
information, Appendix C presents a short annotated bibliography.



To introduce the statistics of sampling we will consider Artificial
University, where there were only six student financial aid recipients in
1980. The results of a record review that included all six clients are
presented in Table A.1.

TABLE A.1: FINANCIAL AID RECIPIENTS AT ARTIFICIAL UNIVERSITY

              SEOG       SEOG        NDSL
Student       Award      Eligible    Award

A             $740       yes         $  100
B              800       no           1,500
C              800       yes            300
D              672       no           1,500
E              800       no             100
F              700       yes          1,010

Total SEOG Awards                    $4,512
Number of SEOG Overpayments               3
Total NDSL Awards                    $4,510





The information about Artificial University will provide the basis
for our introduction to sampling statistics. First, a set of summary
measures for describing population parameters will be presented. Then, a
series of methods for estimating those population parameters on the basis
of data from simple random samples will follow. A word of caution is
required here. THE METHODS DESCRIBED BELOW FOR PROJECTING SAMPLE RESULTS
TO THE TOTAL POPULATION APPLY ONLY TO SIMPLE RANDOM SAMPLES AND
SYSTEMATIC SAMPLES. Other sample designs, such as stratified, cluster or
discovery sampling, employ different formulas.

TOTAL, POPULATION SIZE, MEAN, VARIANCE, STANDARD DEVIATION, AND
DISTRIBUTION

. - _ * - 1 _ 

A primary use of statistics is to summarize complex data in a few
simple measures. The first step in summarizing data is to note how many
cases are in the population under review. The number of cases in the
population is usually symbolized by a capital 'N'. For Artificial
University (AU), N = 6, since there are 6 financial aid recipients. A
second common summary measure is the population total. The population
total is usually symbolized by the Greek letter τ (pronounced tau). The
population total, τ, is calculated by summing the values of all the
individual cases. Individual case values are symbolized by 'x_i'. The
operation of summing all the cases in the population can be symbolized by
Σ; the large Greek letter sigma means "the sum of". Therefore, Σx_i
means "the sum of all individual cases,"

\[ \sum x_i = x_1 + x_2 + x_3 + \dots + x_N \]

where x_N is the Nth, or last, case in the population.

In AU the total value of SEOG awards is:

\[ \tau = \sum x_i = 740 + 800 + 800 + 672 + 800 + 700 = 4512 \]

The number of cases in the population and the population total can be
combined to produce a third common summary measure, the mean or average.
The population mean is symbolized by the small Greek letter μ
(pronounced 'myoo') and is calculated by dividing the population total
(τ) by the number of cases in the population (N), thus:

\[ \mu = \frac{\tau}{N} = \frac{\sum x_i}{N} \qquad (A.1) \]

For AU, the mean value of SEOG awards is:

\[ \mu = \frac{\tau}{N} = \frac{4512}{6} = 752 \]



Table A.2 summarizes SEOG awards and NDSL loans at Artificial
University in terms of number of cases, total and mean.

TABLE A.2: SEOG AWARDS AND NDSL AT ARTIFICIAL UNIVERSITY

                          SEOG       NDSL

Number of Cases (N)          6          6
Total Awards (τ)          4512       4510
Mean Award (μ)             752     751.67
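
The three summary measures in Table A.2 can be reproduced with a few lines of Python. This is a minimal sketch: the list contents are the award amounts for students A through F in Table A.1, and the function name is an illustrative assumption.

```python
def summarize(values):
    """Return (N, total, mean) for a complete population of case values."""
    n_cases = len(values)
    total = sum(values)          # population total, tau
    mean = total / n_cases       # population mean, mu = tau / N
    return n_cases, total, mean


seog = [740, 800, 800, 672, 800, 700]       # SEOG awards, students A-F
ndsl = [100, 1500, 300, 1500, 100, 1010]    # NDSL awards, students A-F

for label, values in (("SEOG", seog), ("NDSL", ndsl)):
    n_cases, total, mean = summarize(values)
    print(f"{label}: N={n_cases} total={total} mean={mean:.2f}")
```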



Number of cases in a population, population total and population mean
are generally not, by themselves, sufficient to describe and adequately
summarize the data under study. Table A.2 reveals practically no
difference between these summary measures describing SEOG and NDSL.
However, returning to Table A.1, we can see that all SEOG award amounts
are clustered between $672 and $800 whereas NDSL award amounts are much
more variable, ranging from $100 to $1,500. To represent this important
difference, a measure of dispersion (or variability or spread) is also
needed. As the words "dispersion", "variability", and "spread" suggest,
the measures that summarize this characteristic indicate the extent to
which individual cases are scattered about the mean.

The two most common measures of dispersion employed in statistics are
'variance' and 'standard deviation'. The variance of a population is
represented by the symbol σ² (small sigma squared). The variance of
a population is calculated by the formula:

\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \qquad (A.2) \]

Where:

σ²  = the variance
x_i = values of the individual cases
μ   = the mean value of the cases
N   = the number of cases in the population

The steps involved in the calculation of the variance are:

1.  The total value is computed (τ).

2.  The mean value of the cases is computed (μ).

3.  The deviations of the individual award amounts from the mean are
    computed (x_i - μ).

4.  The deviations are squared then totaled (Σ(x_i - μ)²).

5.  The sum of the squared deviations is divided by the number of
    cases, 'N'.

Table A.3 illustrates calculation of the variance of SEOG awards in AU.



TABLE A.3: COMPUTATION OF THE VARIANCE

   1          2          3          4              5
            SEOG                               Squared
Student     Award       Mean     Deviation     Deviation
            (x_i)       (μ)      (x_i - μ)     (x_i - μ)²

A            740        752        -12            144
B            800        752         48           2304
C            800        752         48           2304
D            672        752        -80           6400
E            800        752         48           2304
F            700        752        -52           2704

Total       4512                                16160

\[ \tau = 4512 \qquad \mu = \frac{\tau}{N} = 752 \qquad \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{16160}{6} = 2693 \]

The variance of SEOG awards at AU is 2693. This value represents the
average variability in squared dollars. To obtain a measure of
dispersion expressed in terms of the original values, we calculate the
standard deviation. The standard deviation, which is symbolized by the
small Greek letter 'σ', is the square root of the variance. The formula
for the standard deviation is:

\[ \sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}} \qquad (A.3) \]

The standard deviation of SEOG awards at AU is $51.90. The much greater
dispersion of NDSL at AU is represented by a standard deviation of $611.
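
The five steps for the variance can be followed in a short Python sketch. The function names are illustrative assumptions; the data are the award amounts in Table A.1. Summing the squared deviations listed in Table A.3 (144 + 2304 + 2304 + 6400 + 2304 + 2704) gives 16160, which the code divides by N = 6.

```python
import math


def population_variance(values):
    """Variance of a complete population: mean squared deviation."""
    n_cases = len(values)
    mean = sum(values) / n_cases                         # steps 1 and 2
    squared_devs = [(x - mean) ** 2 for x in values]     # steps 3 and 4
    return sum(squared_devs) / n_cases                   # step 5


def population_std_dev(values):
    """Standard deviation: square root of the population variance."""
    return math.sqrt(population_variance(values))


seog = [740, 800, 800, 672, 800, 700]       # SEOG awards, students A-F
ndsl = [100, 1500, 300, 1500, 100, 1010]    # NDSL awards, students A-F

print(f"SEOG variance {population_variance(seog):.0f}, "
      f"standard deviation {population_std_dev(seog):.2f}")
print(f"NDSL standard deviation {population_std_dev(ndsl):.0f}")
```

Both programs have nearly the same mean award, yet the NDSL standard deviation is more than ten times larger, which is exactly the difference in spread visible in Table A.1.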






ATTRIBUTES

To this point the discussion has focused exclusively on continuous
variables such as dollar amount of SEOG awards. However, statistics can
also be applied to categorical attributes such as program eligibility,
which have no natural numeric values attached to them. The question
therefore arises as to how to calculate totals, means, standard
deviations, etc. for case attributes. To give categories a mathematical
representation, cases in the category of interest are commonly assigned a
value of 1 and all other cases are assigned a value of 0.

Categorical case attributes can be summarized in terms of frequency
and proportion. At Artificial University, 3 students, or .5 of all
financial aid recipients, are SEOG eligible. Population frequency will
be symbolized as a large 'F' and proportion of the population having a
certain attribute by a large 'P'. Therefore, for SEOG eligibility at AU,
F = 3 and P = .5. Either frequency or proportion can be used to calculate
population mean, variance and standard deviation:

Frequency form on the left, proportion form on the right:

\[ \tau = F \qquad\qquad \tau = P \cdot N \qquad (A.4) \]

\[ \mu = \frac{F}{N} \qquad\qquad \mu = P \qquad (A.5) \]

\[ \sigma^2 = \frac{F - F^2/N}{N} \qquad\qquad \sigma^2 = P(1 - P) \qquad (A.6) \]

\[ \sigma = \sqrt{\frac{F - F^2/N}{N}} \qquad\qquad \sigma = \sqrt{P(1 - P)} \qquad (A.7) \]

For SEOG eligibility at AU, F = 3, P = .5, σ² = .25 and σ = .5.
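
The 1/0 coding described above makes the attribute formulas a special case of the ones already given for continuous variables. The sketch below illustrates this with the SEOG eligibility column of Table A.1; the variable names are assumptions, and the frequency form of the variance is written as (F - F²/N)/N.

```python
import math

# SEOG eligibility for students A-F from Table A.1, coded yes = 1, no = 0.
eligible = [1, 0, 1, 0, 0, 1]

N = len(eligible)
F = sum(eligible)          # population frequency
P = F / N                  # population proportion

variance_from_proportion = P * (1 - P)
variance_from_frequency = (F - F ** 2 / N) / N
std_dev = math.sqrt(variance_from_proportion)

# The general variance formula applied to the 1/0 codes gives the same result.
mean = sum(eligible) / N
variance_general = sum((x - mean) ** 2 for x in eligible) / N

print(F, P, variance_from_proportion, std_dev)
```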

One additional way data can be summarized is to graph its frequency
distribution. Table A.4 presents graphs of frequency distributions of
the data contained in Table A.1. As Table A.4 shows, frequency
distributions can be shaped in many different ways.

For reasons that will become clear below, sampling statistics make
frequent use of one particular type or shape of frequency distribution:
the normal distribution.



TABLE A.4: FREQUENCY DISTRIBUTION OF STUDENT FINANCIAL AID RECIPIENTS AT
ARTIFICIAL UNIVERSITY

[Four frequency graphs: SEOG Award Amounts, SEOG Eligible (Yes/No),
SEOG Overpayments, and NDSL Amounts.]

The normal distribution is a frequency distribution which is
bell-shaped. Table A.5 gives an example of an approximately normal
distribution. The results of a sample of the SAT math scores for 10,000
high school seniors are graphed in terms of frequency of test score.

Relative frequency of occurrence in a normal distribution is governed
by distance from the mean measured in standard deviations. In Table A.5,
68 percent of the cases fall within one standard deviation of the average
score of 500. Because, for the SAT math scores, the standard deviation
is 100 points, approximately 68 percent of the scores fall between 400
and 600. Similarly, approximately 95.4 percent of the cases fall within
two standard deviations of the mean and 99.7 percent of the cases fall
within three standard deviations. What is true of SAT math scores is
true of any normally distributed variable. In any real situation a
distribution, at best, will be only approximately normally distributed.
However, in many situations, the approximation is very close. Table A.6
gives the percent of cases in terms of distance from the mean for the
normal distribution.

TABLE A.6: NORMAL DISTRIBUTION

                  Distance from the Mean
Percent of        Measured in
Cases             Standard Deviations
                  (Z values)

50.00                  .67
60.00                  .84
70.00                 1.04
80.00                 1.28
90.00                 1.65
95.00                 1.96
98.00                 2.33
99.00                 2.57
99.90                 3.30
99.99                 3.90



To determine the range around the mean that contains a certain
pre-specified percent of cases, we use the following formula:

\[ \mu \pm Z \cdot \sigma \qquad (A.8) \]

For example, if we wanted to know the range around the mean that
contained 95 percent of the SAT scores we would look up the Z value
corresponding to 95 percent, which is 1.96. We know that the standard
deviation of SAT scores is 100 and the mean is 500. Substituting these
values into equation A.8 we obtain:

\[ 500 \pm 1.96 \cdot 100 = 500 \pm 196 = 304 \text{ to } 696 \]

Therefore, 95 percent of the SAT scores fall between 304 and 696.
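
The lookup-and-substitute procedure can be sketched in Python. The dictionary below copies a few rows of Table A.6; the function and variable names are illustrative assumptions.

```python
# Selected rows of Table A.6: percent of cases -> Z value.
Z_VALUES = {80.0: 1.28, 90.0: 1.65, 95.0: 1.96, 99.0: 2.57}


def range_around_mean(mean, std_dev, percent):
    """Range around the mean containing the given percent of cases,
    assuming the variable is approximately normally distributed."""
    z = Z_VALUES[percent]
    half_width = z * std_dev
    return mean - half_width, mean + half_width


low, high = range_around_mean(mean=500, std_dev=100, percent=95.0)
print(round(low), round(high))
```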






POPULATION AND SAMPLE SYMBOLS

To clearly distinguish between summary measures which describe a
population and those that describe a sample, different sets of symbols
are used to represent the two sets of measures. As indicated earlier, a
large 'N' is used to symbolize the size of the population. A small 'n' is
used to symbolize the size of a sample. When a sample statistic is used
to estimate a population parameter a '^' is placed over the symbol to
indicate that it is an estimate. For example, τ̂ is the symbol for an
estimate of the population total. Table A.7 summarizes the symbolism
used in sampling statistics.

TABLE A.7: SAMPLING SYMBOLS

                        Population     Sample
Summary Measure         Symbol         Symbol

Number of Cases         N              n
Total                   τ              τ̂
Mean (average)          μ              x̄
Variance                σ²             s²
Standard deviation      σ              s
Frequency               F              f
Proportion              P              p



TABLE A.5: EXAMPLE OF NORMAL DISTRIBUTION (for a Sample of 10,000 High School Seniors)

[Graph: frequency plotted against test score.]



ESTIMATING THE POPULATION MEAN

Although we know the mean value of SEOG grants at AU we will act as
though this value is unknown to us, and will estimate it through simple
random sampling. We begin with a sample size of 2. The number of cases
in a sample is represented by a small 'n'. In the population of 6 SEOG
recipients at AU, there are 15 possible simple random samples, without
replacement, of 2 cases each. Table A.8 lists all possible samples of
two students in column 2. The mean value of SEOG for each sample is
listed in column 3.

TABLE A.8: SAMPLES OF SEOG RECIPIENTS AT AU

   1          2          3          4             5           6
                                 Population                 Squared
Sample                 Sample    Total         Error of    Error of
Number     Students    Mean      Estimate      Estimate    Estimate

  1        AB          770       4620             18          324
  2        AC          770       4620             18          324
  3        AD          706       4236            -46         2116
  4        AE          770       4620             18          324
  5        AF          720       4320            -32         1024
  6        BC          800       4800             48         2304
  7        BD          736       4416            -16          256
  8        BE          800       4800             48         2304
  9        BF          750       4500             -2            4
 10        CD          736       4416            -16          256
 11        CE          800       4800             48         2304
 12        CF          750       4500             -2            4
 13        DE          736       4416            -16          256
 14        DF          686       4116            -66         4356
 15        EF          750       4500             -2            4

AVERAGE                752       4512                        1077



— ■ . ■ . . . ,. ^ ., ., . . 

If we examine the 15 possible samples listed in Table A.8, we see
variation in the results. Sample 6, for instance, has an average of 800
and sample 14 an average of 686. However, computing the average of the
15 sample means yields a value of $752, the exact value of the true mean
of the population. This result is of great importance. It
demonstrates that simple random sampling will on the average produce a
sample mean (x̄) which is equal to the population mean (μ). Therefore we
can conclude that the sample mean is an unbiased estimate of the
population mean. ('Bias' is defined on page 11.)
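
With a population this small, the claim can be checked by brute force: list every possible sample of two students and average the sample means. The following Python sketch does this; the dictionary of awards comes from Table A.1 and the variable names are illustrative assumptions.

```python
from itertools import combinations

# SEOG awards for the six recipients at Artificial University (Table A.1).
awards = {"A": 740, "B": 800, "C": 800, "D": 672, "E": 800, "F": 700}

population_mean = sum(awards.values()) / len(awards)

# Every simple random sample of 2 students, drawn without replacement.
samples = list(combinations(sorted(awards), 2))
sample_means = [(awards[a] + awards[b]) / 2 for a, b in samples]

average_of_sample_means = sum(sample_means) / len(sample_means)

print(len(samples), "possible samples")
print("lowest and highest sample mean:", min(sample_means), max(sample_means))
print("average of sample means:", average_of_sample_means)
print("population mean:", population_mean)
```

Individual samples miss the true mean by as much as $66, but the misses cancel out across all 15 samples, which is what "unbiased" means here.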

ESTIMATING THE POPULATION TOTAL AND STANDARD DEVIATION

The sample mean can be used to calculate an estimate of the
population total (column 4, Table A.8). The appropriate formula is:

\[ \hat{\tau} = N \cdot \bar{x} \qquad (A.9) \]

The '^' over the 'τ' indicates that it is an estimate of the population
total. Because x̄ is an unbiased estimator of the population mean,
N · x̄ is an unbiased estimator of the population total. This fact is
illustrated in column 4 of Table A.8. An estimate of the population
total is calculated on the basis of each of the 15 samples. The mean
value of the population total estimates is $4512, the exact value of the
true population total.

The sample-based estimate of the standard deviation of the population
is symbolized by a small 's'. The formula for a continuous variable is:

\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \qquad (A.10) \]

Where:

x_i is an individual case value in the sample
x̄ is the mean of the sample
n is the number of cases in the sample

The formula for s for a categorical variable is:

\[ s = \sqrt{\frac{f - f^2/n}{n - 1}} \qquad (A.11) \]

Where:

f is the frequency of the category of interest



The equations for s are identical to the equations for σ with the
exception of the -1 in the denominator. The standard deviation of a
sample is, on the average, less than the standard deviation of the
population and is therefore a biased estimator of the population standard
deviation without the corrective factor of reducing 'n' by one.

STANDARD ERROR OF THE MEAN



Although the sample mean is an unbiased estimator of the population
mean, as Table A.8 illustrates, there can be great dispersion of sample
means. One measure of the accuracy of the sampling plan is the mean
square error (MSE) of the estimates of the mean.

\[ MSE = \frac{\sum (\text{error of estimate})^2}{\text{number of samples}} \qquad (A.12) \]

Returning to the data in Table A.8, the error of estimate for each sample
can be found in column 5 and the squared error of estimate in column 6.
Substituting these values into equation A.12 we obtain:

\[ MSE = \frac{16160}{15} = 1077 \]

To state the error of the estimate in terms of dollars rather than
squared dollars we take the square root of the MSE to produce the
standard error of the mean: √1077 = $32.82. The standard error of the
mean is a measure of the reliability or precision of a sampling plan.
The standard error of the mean is the standard deviation of sample means
and is symbolized by 'σ_x̄'.
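
The MSE and the standard error of the mean can be recomputed directly from the 15 possible samples. The Python sketch below is illustrative; the variable names are assumptions and the award amounts are those of Table A.1.

```python
import math
from itertools import combinations

# SEOG awards for the six recipients at Artificial University (Table A.1).
awards = [740, 800, 800, 672, 800, 700]
population_mean = sum(awards) / len(awards)

# Mean of every possible sample of 2 cases drawn without replacement.
sample_means = [(a + b) / 2 for a, b in combinations(awards, 2)]

# Error of estimate (column 5) and squared error (column 6) of Table A.8.
errors = [m - population_mean for m in sample_means]
squared_errors = [e ** 2 for e in errors]

mse = sum(squared_errors) / len(sample_means)     # equation A.12
standard_error = math.sqrt(mse)

print(f"sum of squared errors {sum(squared_errors):.0f}")
print(f"MSE {mse:.0f}")
print(f"standard error of the mean {standard_error:.2f}")
```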

In actual practice we almost never have the data necessary to
directly calculate the standard error of the mean for a sampling
procedure. We usually do not know the true population mean and draw only
one, not fifteen, samples. Therefore, an alternative method of
determining the reliability of a sampling procedure is needed.

Fortunately, the standard error of the mean can be estimated on the
basis of data from a single sample. For simple random sampling from a
finite population without replacement the formula for the
estimate of σ_x̄ is:

\[ \hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N}} \qquad (A.14) \]



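Substituting equation A.10 for 's' into the equation we obtain:

\[ \hat{\sigma}_{\bar{x}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n(n - 1)} \cdot \frac{N - n}{N}} \qquad (A.15) \]

When sampling from an infinite population or sampling with replacement
the formula for σ̂_x̄ can be simplified to:

\[ \hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n(n - 1)}} \qquad (A.16) \]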

For a sufficiently Targe sample (n s 30)»ff^ will be approximately 
normally distributed with mean of ^. Thismathemat leal fast, known as 
the Central Limit Theorem ^ is .significaht because It allows calculation 
of confidence Intervals on the basis of knowi characteristics, of the 
normal distribution. - 

We know, from Table A.6, that 95 percent of the cases fall within
1.96 standard deviations of the distribution mean. Because the Central
Limit Theorem states that the mean of the distribution is \(\mu\), we can
conclude that 95 percent of all sample means will fall within
1.96 \(\sigma_{\bar{x}}\) of the true population mean, \(\mu\). In other words, if 1000
samples of the same size were drawn from a single population,
approximately 950 of the sample means would fall within 1.96 \(\sigma_{\bar{x}}\) of the
population mean.

If 95 percent of the possible values of \(\bar{x}\) fall within 1.96 \(\sigma_{\bar{x}}\) of \(\mu\),
then \(\mu\) will not be further than 1.96 \(\sigma_{\bar{x}}\) from 95 percent of the possible
values of \(\bar{x}\). This leads us to the final step in our reasoning, the
pay-off: if we estimate a confidence interval of \(\bar{x} \pm 1.96\,\sigma_{\bar{x}}\) and if
we construct a large number of such intervals, 95 percent of the interval
estimates will include \(\mu\). Therefore the commonly used phrase: "at 95%
confidence."
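The reasoning above can be checked by simulation. The sketch below is an illustration under stated assumptions: the population is taken to be normally distributed with a mean of 750 and a standard deviation of 100, each sample has 50 cases, and sampling is with replacement. None of these figures come from the handbook.

```python
import math
import random


def coverage(num_samples=2000, n=50, mu=750.0, sigma=100.0, z=1.96, seed=1):
    """Fraction of intervals (sample mean +/- z * standard error) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = sum(sample) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
        half_width = z * s / math.sqrt(n)
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / num_samples


rate = coverage()
print(f"{rate:.1%} of the intervals contain the true mean")
```

The proportion printed should be close to 95 percent, which is what the phrase "at 95% confidence" describes: a property of the procedure over many samples, not of any single interval.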






CONFIDENCE INTERVAL FOR THE MEAN

The formula for calculating the confidence interval for the mean is:

\[ CI_{\bar{x}} = \bar{x} \pm Z \sigma_{\bar{x}} \]    (A.17)



If we substitute equation A.15 for \(\sigma_{\bar{x}}\) into equation A.17 we obtain:

\[ CI_{\bar{x}} = \bar{x} \pm Z \sqrt{\frac{\sum (x_i - \bar{x})^2}{n(n-1)} \cdot \frac{N-n}{N}} \]    (A.18)

The steps involved in the calculation of the confidence interval are:



     1.  The sample mean (\(\bar{x}\)) is computed.

     2.  The standard error of the mean is computed by:

         2A  The sum of squared deviations around the mean,
             \(\sum (x_i - \bar{x})^2\), is computed.
         2B  The denominator, n(n-1), is computed.
         2C  The finite population correction, (N - n)/N, is computed.
         2D  The results of steps 2A, 2B, and 2C are substituted into
             formula A.15 and the square root taken.

     3.  The Z value is obtained from Table A.6.

     4.  The sample mean, the Z value, and the standard error of the mean
         are substituted into the equation for the confidence interval.
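The steps above can be sketched as a single Python function, with one commented line per step. The function name, the 40 hypothetical award amounts, and the population size of 400 are assumptions made for illustration; the Z value of 1.96 is the 95 percent value cited in the text.

```python
import math


def confidence_interval_mean(sample, population_size, z=1.96):
    """Confidence interval for the mean, following steps 1 to 4."""
    n = len(sample)
    # Step 1: sample mean.
    mean = sum(sample) / n
    # Step 2A: sum of squared deviations around the mean.
    sum_squared_deviations = sum((x - mean) ** 2 for x in sample)
    # Step 2B: the denominator n(n-1).
    denominator = n * (n - 1)
    # Step 2C: finite population correction (N - n)/N.
    fpc = (population_size - n) / population_size
    # Step 2D: combine and take the square root.
    standard_error = math.sqrt(sum_squared_deviations / denominator * fpc)
    # Step 3: the Z value is supplied by the caller (1.96 for 95 percent).
    # Step 4: substitute into the confidence interval equation.
    half_width = z * standard_error
    return mean - half_width, mean + half_width


# Hypothetical sample of 40 awards ($700, $705, ..., $895) from 400 recipients.
sample = [700 + 5 * i for i in range(40)]
low, high = confidence_interval_mean(sample, population_size=400)
print(round(low, 2), round(high, 2))
```

A sample of 40 is used here because the Z-based interval assumes at least thirty cases.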

Returning to Artificial University records, we draw a three-student
simple random sample of SEOG recipients to estimate the average grant
amount of the population with a 95 percent confidence level. From Table
A.1 the students selected are A, C, and D. Because we have selected a 95
percent confidence level, from Table A.6, Z = 1.96. Table A.9 illustrates
the calculation of the confidence interval.






TABLE A.9: EXAMPLE OF CALCULATING A CONFIDENCE INTERVAL

     Sample      SEOG Amount

     A           740
     C           800
     D           672

     Step

     1.   \(\bar{x} = \sum x_i / n\) = (740 + 800 + 700)/3 = 2240/3 = 746.67

     2.   \(\sigma_{\bar{x}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n(n-1)} \cdot \frac{N-n}{N}}\),  N > n

     2.A  \(\sum (x_i - \bar{x})^2\) = (740 - 746.67)² + (800 - 746.67)² + (672 - 746.67)²
                     = 44.44 + 2844.44 + 5575.11 = 8464

     2.B  n(n-1) = 3(3-1) = 3(2) = 6

     2.C  (N - n)/N = (6 - 3)/6 = 3/6 = .5

     2.D  \(\sigma_{\bar{x}}\) = √(8464/6 × .5) = √(1410.67 × .5) = √705.33 = 26.56

     3.   Z = 1.96 (95% confidence level)

     4.   \(CI_{\bar{x}} = \bar{x} \pm Z \sigma_{\bar{x}}\)
               = 746.67 ± 1.96 × 26.56 = 746.67 ± 52.05
               = 694.62 to 798.72






SMALL SAMPLES

The confidence interval obtained does, indeed, include the true
population mean value of $752. This result, however, must be attributed
to good luck rather than good statistics. As already indicated, the
estimation equations that were employed assume a sample size of at least
thirty cases. For sample sizes under thirty, the Central Limit Theorem
is not generally applicable. Tchebysheff's Theorem can be used as an
alternative to the Central Limit Theorem for making population estimates
on the basis of a small sample. Tchebysheff's Theorem states that at
least (1 - 1/K²) of a set of measurements will lie within K standard
deviations of their mean. To employ Tchebysheff's Theorem, simply
replace the 'Z' value in equation A.17 with a 'K' value from Table A.10.
Thus, for a small sample, the equation for estimating the confidence
interval becomes:

\[ CI_{\bar{x}} = \bar{x} \pm K \sigma_{\bar{x}} \]    (A.19)



To calculate the correct confidence interval for the sample of three
AU SEOG recipients, the same steps as before are performed, except in
this case Table A.10 is used to obtain a 'K' value rather than Table A.6
to obtain a 'Z' value. The results of the computations are:

     \(CI_{\bar{x}} = \bar{x} \pm K \sigma_{\bar{x}}\)
          = $746.67 ± $118.72, or
          = $627.95 to $865.39

The results can be described as follows:

"On the basis of a three-student simple random sample of Artificial
University SEOG recipients, it can be estimated, with 95 percent
confidence, that the mean value of SEOG awards at AU falls between $627.95
and $865.39, with the most likely value $746.67." "95 percent confidence"
means that if we were to follow the same procedure for drawing multiple
three-student samples of AU SEOG recipients, 95 percent of the resulting
confidence intervals would contain the true population mean. Table A.11
verifies this result. Confidence interval estimates for average SEOG
awards at AU based on all possible three-student samples at 95 percent
confidence are displayed. Of the 20 confidence intervals, 19, or 95
percent, contain the true mean of $752.




TABLE A.10: TCHEBYSHEFF'S THEOREM

                        Distance From the Mean
     Percent            (Measured in Standard Deviations)
     of Cases           (K Values)

      50.00                  1.41
      60.00                  1.58
      70.00                  1.83
      80.00                  2.24
      90.00                  3.16
      95.00                  4.47
      98.00                  7.07
      99.00                 10.00
      99.90                 31.62
      99.99                100.00
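The K values in Table A.10 follow from the statement of the theorem: if at least (1 - 1/K²) of the cases lie within K standard deviations, then K = 1/√(1 - fraction). The short sketch below recomputes the table; the function name `chebyshev_k` is an illustrative assumption.

```python
import math


def chebyshev_k(fraction):
    """Smallest K for which at least `fraction` of cases lie within K standard deviations."""
    return 1 / math.sqrt(1 - fraction)


for percent in (50, 60, 70, 80, 90, 95, 98, 99, 99.9, 99.99):
    print(f"{percent:6.2f}  {chebyshev_k(percent / 100):7.2f}")
```

Because the theorem makes no assumption about the shape of the distribution, these K values are much larger than the corresponding Z values, and the resulting intervals are wider.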



CONFIDENCE INTERVAL FOR THE TOTAL

The confidence interval for a sample-based estimate of the population
total is obtained by simply multiplying the confidence interval for
the mean, \(CI_{\bar{x}}\), by N, the number of cases in the population:

\[ CI_{\hat{T}} = N \cdot CI_{\bar{x}} \]    (A.20)

For n ≥ 30:

\[ CI_{\hat{T}} = N (\bar{x} \pm Z \sigma_{\bar{x}}) = N\bar{x} \pm N Z \sigma_{\bar{x}} \]    (A.21)

For n < 30:

\[ CI_{\hat{T}} = N (\bar{x} \pm K \sigma_{\bar{x}}) = N\bar{x} \pm N K \sigma_{\bar{x}} \]    (A.22)
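As a brief sketch of equations A.21 and A.22, the function below scales an interval for the mean up to an interval for the total. The function name and the example figures (1,000 recipients, a sample mean of $750, a standard error of $10) are hypothetical.

```python
def confidence_interval_total(population_size, sample_mean, standard_error, multiplier):
    """Interval for the population total: N * mean +/- N * multiplier * standard error.

    `multiplier` is a Z value for samples of 30 or more, or a K value
    from Tchebysheff's Theorem for smaller samples.
    """
    estimate = population_size * sample_mean
    half_width = population_size * multiplier * standard_error
    return estimate - half_width, estimate + half_width


low, high = confidence_interval_total(1_000, 750, 10, 1.96)
print(round(low), round(high))
```

Note that the margin of error grows in proportion to N, so a precise estimate of the mean can still leave a wide dollar range for the total in a large population.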



TABLE A.11: EXAMPLE OF RELATION BETWEEN SAMPLE CONFIDENCE INTERVAL AND
            POPULATION MEAN (Confidence interval estimates of average
            SEOG awards at AU based on three-case samples at 95%
            confidence)

[Chart not reproduced. Vertical axis: Sample (each three-student
combination of students A through F). Horizontal axis: Award Amount,
$592 to $892.]



THE RELATION BETWEEN CONFIDENCE INTERVAL AND CONFIDENCE LEVEL

In the equation for calculating confidence intervals for the mean,
\(CI = \bar{x} \pm Z \sigma_{\bar{x}}\) (Formula A.17), a direct relationship can be seen between
the confidence level (as represented by 'Z') and the confidence interval
(CI). The higher the 'Z' value, the wider the confidence interval. This
is the result of the commonsensical fact that the more certain we wish to
be that the true population mean falls somewhere in the confidence
interval, the wider the interval must be. Table A.12 illustrates the
relationship between confidence intervals and confidence levels. For the
example given in the table, with a sample size of 250 at a 95 percent
confidence level, the confidence interval is ± 6.2 percent. With a
higher confidence level of 99 percent, the confidence interval grows to ±
8.1 percent.
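The two figures quoted from Table A.12 can be reproduced with a few lines of Python. This sketch assumes a two-category variable with a 50/50 split (variance .25), sampling with replacement from a large population, and a Z value of 2.576 for 99 percent confidence; the 2.576 is the standard normal value and is an assumption here, since Table A.6 is not shown in this section.

```python
import math


def half_width(z, sample_size, variance=0.25):
    """Half-width of the confidence interval, Z times the standard error."""
    return z * math.sqrt(variance / sample_size)


for level, z in ((95, 1.96), (99, 2.576)):
    print(f"{level} percent confidence: +/- {100 * half_width(z, 250):.1f} percent")
```

Holding the sample size fixed, the only way to gain confidence is to accept a wider interval.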

" • ■ \ ■ • - ' . ■ ' • , ^' '■• • 

MINIMUM NECESSARY SAMPLE SIZE

For a simple random sample drawn without replacement the formula for
determining minimum necessary sample size is:

\[ n = \frac{N Z^2 \sigma^2}{E^2 N + Z^2 \sigma^2} \]    (A.23)

Where:

     n  = Minimum necessary sample size
     Z  = Z value based on desired confidence level. (See
          Table A.7 to obtain appropriate value.)
     E  = Acceptable average level of error of estimate (confidence
          interval).
     \(\sigma^2\) = Estimate of population variance.

As an illustration, consider a quality control review of BEOG
applications. The reviewers wished to establish, at the 90 percent
confidence level, the average family income of applicants within a $300
confidence interval. The sample was to be drawn from a data file which
contained approximately 6,000,000 applications. From previous studies,
the reviewers estimated the family income standard deviation at $9,000.
Translating these facts into the proper statistical notation:

     N  = 6,000,000 (The number of cases in the population sampled)
     E  = $300 (The acceptable average level of error of estimate)
     Z  = 1.65 (The Z value associated with a 90% confidence level
          from Table A.7)
     \(\sigma^2\) = (9,000)² = 81,000,000 (Estimated variance of family
          income)

Substituting these values into equation A.23, we obtain:

\[ n = \frac{(6{,}000{,}000)(1.65)^2 (9{,}000)^2}{(300)^2 (6{,}000{,}000) + (1.65)^2 (9{,}000)^2} \approx 2450 \]

TABLE A.12: EXAMPLES OF THE RELATION BETWEEN CONFIDENCE INTERVALS AND
            CONFIDENCE LEVELS

[Chart not reproduced. Horizontal axis: Confidence Interval (in percent
of total), 0 to ±10.]

(Confidence level by confidence interval for various sample sizes for
a two-category variable with a 50/50 population distribution based on
simple random sampling with replacement from a large population.)

THE RELATION BETWEEN SAMPLE SIZE AND POPULATION SIZE

Examination of equation A.23 for minimum necessary sample size
reveals that there is no single, direct relation between population size
and necessary sample size. Therefore, it is not possible to determine
necessary sample size as a simple percent of the population. An
illustration of the complex relation between sample and population size is
contained in Table A.13. In the case illustrated, for populations under
3,200 cases, the necessary sample size is directly related to population
size. However, for large populations the required sample size is almost
completely independent of the size of the population. Thus, in the
example, the sample size required for a population of 25,000 is almost
equal to the sample size for a population of 1,000,000.

This result is of great importance. In a large population, the
necessary sample size depends primarily on the variability of the
population and only a little on the fraction of the population sampled.
Many people intuitively feel, as an example, that a 30% sample of a
population of 200 would yield much more precise results than a .1% sample
of a population of a million. In fact, as Table A.13 demonstrates, the
exact opposite is true. This helps explain the great cost savings that
are possible using sampling with a large population.
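The comparison in the paragraph above can be made concrete. The sketch below expresses the standard error of the mean as a multiple of the population standard deviation, using the finite-population formula given earlier; the function name `relative_standard_error` is an assumption, and the two populations are assumed to be equally variable.

```python
import math


def relative_standard_error(sample_size, population_size):
    """Standard error of the mean divided by the population standard deviation."""
    fpc = (population_size - sample_size) / population_size
    return math.sqrt(fpc / sample_size)


small_population = relative_standard_error(60, 200)            # 30% of 200
large_population = relative_standard_error(1_000, 1_000_000)   # .1% of a million

print(round(small_population, 3), round(large_population, 3))
```

The smaller value belongs to the .1% sample: its 1,000 cases outweigh the 60 cases of the 30% sample, even though the sampling fraction is far lower.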

TABLE A.13: EXAMPLE OF RELATION BETWEEN MINIMUM NECESSARY SAMPLE SIZE AND
            POPULATION SIZE

[Chart not reproduced. Horizontal axis: Size of Population (N), from 400
to 819,200, doubling at each step.]

(Required sample size for different population sizes for a two-category
variable with 50/50 population distribution, 95 percent confidence
and ± 3 percent confidence interval.)

PRACTICAL PROBLEMS IN DETERMINING SAMPLE SIZE

In many practical circumstances, all the information needed to
calculate minimum necessary sample size is not readily available. When
N, the number of cases in the population, is unknown, or when N is very
large as in the example above, the formula for minimum necessary sample
size can be simplified to:

\[ n = \frac{Z^2 \sigma^2}{E^2} \]    (A.24)

Returning to the previous example, and substituting in the appropriate
values, we get:

\[ n = \frac{(1.65)^2 (9{,}000)^2}{(300)^2} \approx 2450 \]

The solutions to necessary sample size were identical for the two
formulas. The results given by equation A.23 and equation A.24 will
diverge significantly only when the sample size is 1% or greater of the
total population.
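The behavior of the two formulas can be compared directly. In this sketch the function names are illustrative assumptions; the Z value, acceptable error, and standard deviation are those of the BEOG example, and the two smaller population sizes are hypothetical values added to show where the formulas part ways.

```python
def sample_size_finite(N, z, e, sigma):
    """Minimum necessary sample size with the population size included (equation A.23)."""
    return (N * z**2 * sigma**2) / (e**2 * N + z**2 * sigma**2)


def sample_size_simplified(z, e, sigma):
    """Simplified formula for unknown or very large populations (equation A.24)."""
    return (z**2 * sigma**2) / e**2


z, e, sigma = 1.65, 300, 9_000
simplified = sample_size_simplified(z, e, sigma)
for N in (5_000, 50_000, 6_000_000):
    finite = sample_size_finite(N, z, e, sigma)
    print(f"N = {N:>9,}   A.23: {finite:7.1f}   A.24: {simplified:7.1f}   "
          f"sampling fraction: {finite / N:.2%}")
```

For the 6,000,000 applications the two results differ by about one case. As the population shrinks and the sampling fraction rises, equation A.23 gives a noticeably smaller sample than equation A.24.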

A common situation is that the variance of the population to be
sampled is unknown. When this is the case, there are a variety of ways of
estimating \(\sigma^2\), the population variance:

     •  \(\sigma^2\) can be estimated on the basis of previous studies or past
        experience.

     •  A small pilot sample can be drawn to estimate \(\sigma^2\).

     •  If sampling from an approximately normally distributed
        population, \(\sigma^2\) can be roughly estimated from R [formula not
        legible in the source copy], where R is the range. The range is
        the highest known value minus the lowest known value.

     •  When the variable being studied is categorical, \(\sigma^2\) can be
        estimated by assuming the maximum variance and setting \(\sigma^2\) =
        .25.
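The value .25 in the last item can be checked numerically. The sketch assumes the standard result that a two-category variable with proportion p in one category has variance p(1 - p); that formula is not stated in this section, and the function name is illustrative.

```python
def categorical_variance(p):
    """Variance of a two-category variable with proportion p in one category."""
    return p * (1 - p)


proportions = [i / 100 for i in range(101)]
worst_case = max(proportions, key=categorical_variance)
print(worst_case, categorical_variance(worst_case))
```

Since no other split produces a larger variance, a sample size computed with .25 is large enough whatever the true proportion turns out to be.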



A third common problem in determining necessary sample size arises in
situations where the acceptable error is defined in terms of estimating
the population total rather than estimating the population mean. An
example would be to estimate total aid overpayments (made by an
institution) rather than average overpayment. Suppose an auditor wanted
to draw a sample which would allow estimation of total SEOG overpayments
(within a margin of error of ± $100,000) at a university having 7,600
SEOG recipients. To determine required sample size it would first be
necessary to convert the total acceptable error into average acceptable
error by dividing the total acceptable error ($100,000) by the number of
cases in the population (N = 7,600 SEOG recipients).

Thus:

     E = (Total acceptable error)/N = (Average acceptable error)

     E = ($100,000)/(7,600) = $13.16                              (A.25)

The resulting value can then be employed in equation A.23 to
establish minimum necessary sample size.
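A short sketch of the conversion, followed by the sample size calculation it feeds into. The total error and the number of recipients are from the example; the standard deviation of $150 and the Z value of 1.96 are hypothetical, since the text gives neither for this case, and the function names are illustrative.

```python
def average_acceptable_error(total_error, population_size):
    """Convert an acceptable error for the total into one for the mean (equation A.25)."""
    return total_error / population_size


def min_sample_size(N, z, e, sigma):
    """Minimum necessary sample size (equation A.23)."""
    return (N * z**2 * sigma**2) / (e**2 * N + z**2 * sigma**2)


E = average_acceptable_error(100_000, 7_600)
n = min_sample_size(7_600, z=1.96, e=E, sigma=150)   # sigma and z are assumed
print(round(E, 2), round(n, 1))
```

Under these assumed values the auditor would need to review fewer than 500 of the 7,600 recipients.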

SELF-TESTING REVIEW

Based on the data in Table A.14, complete the following exercises.
Answers can be found on the following page.

TABLE A.14: CWS EARNINGS AT ARTIFICIAL UNIVERSITY

     Student          CWS Earnings

     A                $  470
     B                   750
     C                 1,100
     D                   120
     E                   590
     F                   600





1.  Compute the population size (N), mean (\(\mu\)), total (T), variance
    (\(\sigma^2\)), and standard deviation (\(\sigma\)).

2.  On the basis of a sample of students C, B, and E, calculate the
    sample mean (\(\bar{x}\)), total (T), variance (s²) and standard deviation (s).

3.  On the basis of a sample of students D, E, A, C, estimate the
    population total (\(\hat{T}\)) and a confidence interval with an 80 percent
    confidence level.

4.  For a university with 375 CWS recipients, what is the necessary sample
    size to estimate the population mean ± $60 at a 90 percent confidence
    level assuming a population standard deviation of \(\sigma\) = $294.32?






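ANSWERS TO SELF-TESTING REVIEW

1.  N = 6

    \(\mu = \sum x_i / N\) = 3630/6 = $605

    T = \(\sum x_i\) = $3630

    \(\sigma^2 = \sum (x_i - \mu)^2 / N\) = 519,750/6 = 86,625

    \(\sigma\) = 294.32

2.  \(\bar{x} = \sum x_i / n\) = 2440/3 = 813.33

    T = \(\sum x_i\) = 2440

    \(s^2 = \sum (x_i - \bar{x})^2 / (n - 1)\) = 136066.67/2 = 68033.33

    s = 260.83

3.  \(CI_{\hat{T}} = N\bar{x} \pm N \cdot K \cdot \sigma_{\bar{x}}\)

    At a confidence level of 80%, K = 2.24

    N = 6

    \(\bar{x} = \sum x_i / n\) = 2280/4 = 570

    \(\sigma_{\bar{x}} = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n(n-1)} \cdot \frac{N-n}{N}}\) = √(493800/12 × 2/6) = 117.12

    CI = 6 × 570 ± 6 × 2.24 × 117.12

       = 3420 ± 1574.07

       = $1845.93 to $4994.07

4.  N = 375

    Z = 1.65 at 90% confidence

    E = 60

    \(\sigma\) = 294.32

    \[ n = \frac{(375)(1.65)^2 (294.32)^2}{(60)^2 (375) + (1.65)^2 (294.32)^2} = 56 \]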
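Readers who want to check the answers to exercises 1 and 2 by machine can do so with Python's standard `statistics` module. In this sketch the earnings for student C ($1,100) are an assumption inferred from the printed totals, because that entry is not legible in Table A.14 as scanned.

```python
import statistics

# CWS earnings by student; the value for C is inferred, not read from the table.
earnings = {"A": 470, "B": 750, "C": 1100, "D": 120, "E": 590, "F": 600}

# Exercise 1: population figures (pstdev divides by N).
population = list(earnings.values())
population_total = sum(population)
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)

# Exercise 2: sample of students C, B, and E (stdev divides by n - 1).
sample = [earnings[student] for student in "CBE"]
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

print(population_total, population_mean, round(population_sd, 2))
print(round(sample_mean, 2), round(sample_sd, 2))
```

The distinction between `pstdev` and `stdev` mirrors the distinction between \(\sigma\) and s in the exercises.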






APPENDIX B: SAMPLING SYMBOLS AND FORMULAS

(For Simple Random Sampling Without Replacement)

[Tables not reproduced. Each table gives symbols and formulas in
separate columns for continuous and categorical variables. Rows of the
first table: number of cases; frequency of attribute of interest;
proportion of attribute of interest; total; mean; standard deviation.
Rows of the second table: standard error of the mean; confidence
interval for the estimate of the mean (see Table A.6 for Z values and
Table A.10 for K values); confidence interval for the estimate of the
population total; minimum necessary sample size. The symbols and
formulas themselves are not legible in this copy.]






APPENDIX C

BIBLIOGRAPHY

There are a large number of books and articles on sampling
statistics. For those wishing to take the next step beyond the materials
presented in this manual, Mendenhall (1976), Slonim (1960) and Sanders
(1967) offer good treatments of basic sampling on an elementary
mathematical level. Stuart (1962) covers the basic concepts of sampling
in a presentation with very little mathematics. An excellent short
presentation of stratified and cluster samples is contained in Lazerwitz
(1968). On an intermediate mathematical level, Jessen (1978) is a very
useful reference. Hansen (1953), Cochran (1967), and Kish (1965) are
valuable general books on statistical sampling which have become standard
references. Volume II of Hansen contains the statistical derivations of
most common formulas. For treatments of statistical sampling
specifically related to the needs of auditors, see Arkin (1963) and
Deming (1960).



Anderson, R., and Teitlebaum, A.D. "Dollar-unit Sampling" in Canadian
     Chartered Accountant. April 1973.

Arkin, H. 1963. Handbook of Sampling for Auditing and Accounting.
     McGraw-Hill, New York.

Cochran, W.G. 1967. Sampling Techniques. 3rd ed. Wiley, New York.

Deming, W.E. 1960. Sample Design in Business Research. Wiley,
     New York.

Hansen, M.H.; Hurwitz, W.N.; and Madow, W.G. 1953. Sample Survey Methods
     and Theory. Wiley, New York.

Jessen, R.J. 1978. Statistical Survey Techniques. Wiley, New York.

Kish, L. 1965. Survey Sampling. Wiley, New York.

Lazerwitz, B. 1968. "Sampling Theory and Procedures," in H. Blalock,
     Methodology in Social Research. 278-328. McGraw-Hill, New York.

Mendenhall, W.; Ott, L.; and Scheaffer, R. 1976. Survey Sampling.
     Duxbury Press, Belmont, California.

Sanders, D.H.; Murph, A.F.; and Eng, R.J. Statistics: A Fresh
     Approach.

Snedecor, G.W.; and Cochran, W.G. 1980. Statistical Methods. 7th ed.
     Iowa State University Press, Ames, Iowa.

Slonim, M.J. 1960. Sampling in a Nutshell. Simon and Schuster,
     New York.

Stuart, A. 1962. Basic Ideas of Scientific Sampling. Griffin, London.



INDEX

Attributes
Bias
Central Limit Theorem
Cluster Sampling
Confidence Interval
Confidence Level
Confidence Limits
Discovery Sampling
Distribution
Dollar-Unit Sampling
Error, Sampling
Exploratory Sampling
Frequencies
Interval Estimation
Interval Sampling
Judgmental Sampling
Mean
Mean Square Error
Minimum Necessary Sample Size
Multi-stage Sampling
Multi-use Samples
Normal Distribution
Opportunity Sampling
Parameter
Point Estimation
Population
Precision
Quota Sampling
Random Number Table
Replacement
Sample Size
Sampling
     Advantages
     Disadvantages
     Types of
Sampling Error
Sampling Unit
Sequential Sampling
Simple Random Sampling
Small Samples
Standard Deviation
Standard Error of the Mean
Statistics
Stop or Go Sampling
Stratified Sampling
Systematic Sampling
Tchebysheff's Theorem
Variance
Z Values






